NOVEL NUCLEIC ACIDS AND POLYPEPTIDES

1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such
polynucleotides, along with uses for these polynucleotides and proteins, for example in
therapeutic, diagnostic and research methods.

2. BACKGROUND 

Technology aimed at the discovery of protein factors (including, e.g., cytokines, such as
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past
decade. The now routine hybridization cloning and expression cloning techniques clone novel
polynucleotides "directly" in the sense that they rely on information directly related to the
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of
hybridization cloning; activity of the protein in the case of expression cloning). More recent
"indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences
based on the presence of a now well-recognized secretory leader sequence motif, as well as
various PCR-based or low stringency hybridization-based cloning techniques, have advanced the
state of the art by making available large numbers of DNA/amino acid sequences for proteins
that are known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based
techniques, or by virtue of structural similarity to other genes of known biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping, identification of mutations responsible for
genetic disorders or other traits, assessment of biodiversity, and production of many other types
of data and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules,
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic
variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more
epitopes present on such polypeptides, as well as hybridomas producing such antibodies.

The compositions of the present invention additionally include vectors, including expression
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such
polynucleotides and cells genetically engineered to express such polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public databases.
The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated as SEQ ID NO: 1-1786 and 3573-5358. The polypeptide sequences are
designated SEQ ID NO: 2n (wherein n = 1 to 20). The nucleic acids and polypeptides are provided
in the Sequence Listing. In the nucleic acids provided in the Sequence Listing, A is adenosine; C is
cytosine; G is guanine; T is thymine; and N is any of the four bases. In the amino acids provided in
the Sequence Listing, * corresponds to the stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-1786 and 3573-5358 under stringent hybridization
conditions; nucleic acid sequences which are allelic variants or species homologues of any of the
nucleic acid sequences recited above; or nucleic acid sequences that encode a peptide comprising a
specific domain or truncation of the peptides encoded by SEQ ID NO: 1-1786 and 3573-5358. Also
included is a polynucleotide comprising a nucleotide sequence having at least 90% identity to an
identifying sequence of SEQ ID NO: 1-1786 and 3573-5358 or a degenerate variant or fragment
thereof. The identifying sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The sequence information
can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that uniquely identifies or
represents the sequence information of SEQ ID NO: 1-1786 and 3573-5358.

A collection as used in this application can be a collection of only one polynucleotide. The
collection of sequence information or identifying information of each sequence can be provided on
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358
or novel segments or parts of the nucleic acids of the invention are used as primers in
expression assays that are well known in the art. In a particularly preferred embodiment, the nucleic
acid sequences of SEQ ID NO: 1-1786 and 3573-5358 or novel segments or parts of the nucleic
acids provided herein are used in diagnostics for identifying expressed genes or, as well known in
the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for
physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-1786 and
3573-5358; a polynucleotide comprising any of the full length protein coding sequences of SEQ ID
NO: 1-1786 and 3573-5358; and a polynucleotide comprising any of the nucleotide sequences of the
mature protein coding sequences of SEQ ID NO: 1-1786 and 3573-5358. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent hybridization conditions to (a) the complement of any one of the nucleotide sequences set
forth in SEQ ID NO: 1-1786 and 3573-5358; (b) a nucleotide sequence encoding any one of the
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic
variant of any polynucleotides recited above; (d) a polynucleotide which encodes a species homolog
(e.g., orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a
polypeptide comprising a specific domain or truncation of any of the polypeptides comprising an
amino acid sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in the Sequence Listing; or the corresponding
full length or mature protein. Polypeptides of the invention also include polypeptides with biological
activity that are encoded by (a) any of the polynucleotides having a nucleotide sequence set forth in
SEQ ID NO: 1-1786 and 3573-5358; or (b) polynucleotides that hybridize to the complement of the
polynucleotides of (a) under stringent hybridization conditions. Biologically or immunologically
active variants of any of the polypeptide sequences in the Sequence Listing, and "substantial
equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99%
amino acid sequence identity) that preferably retain biological activity are also contemplated. The
polypeptides of the invention may be wholly or partially chemically synthesized but are preferably
produced by recombinant means using the genetically engineered cells (e.g., host cells) of the
invention.

The invention also provides compositions comprising a polypeptide of the invention.
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of 
the invention. 

The invention also relates to methods for producing a polypeptide of the invention 
comprising growing a culture of the host cells of the invention in a suitable culture medium 
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide 
from the culture or from the host cells. Preferred embodiments include those in which the 
protein produced by such process is a mature form of the protein. 

Polynucleotides according to the invention have numerous applications in a variety of 
techniques known to those skilled in the art of molecular biology. These techniques include use 
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene 
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA 
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is 
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used 
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample 
using, e.g., in situ hybridization. 

In other exemplary embodiments, the polynucleotides are used in diagnostics as 
expressed sequence tags for identifying expressed genes or, as well known in the art and 
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical 
mapping of the human genome. 

The polypeptides according to the invention can be used in a variety of conventional 
procedures and methods that are currently applied to other proteins. For example, a polypeptide 
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such 
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the 
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight 
markers, and as a food supplement. 

Methods are also provided for preventing, treating, or ameliorating a medical condition,
which comprise the step of administering to a mammalian subject a therapeutically effective
amount of a composition comprising a polypeptide of the present invention and a
pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be utilized, for 
example, in methods for the prevention and/or treatment of disorders involving aberrant protein 
expression or biological activity. 



WO 01/53312 PCT/US00/34263

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the
identification of subjects exhibiting a predisposition to such conditions. The invention provides
a method for detecting the polynucleotides of the invention in a sample, comprising contacting
the sample with a compound that binds to and forms a complex with the polynucleotide of
interest for a period sufficient to form the complex and under conditions sufficient to form a
complex and detecting the complex such that if a complex is detected, the polynucleotide of
interest is detected. The invention also provides a method for detecting the polypeptides of the
invention in a sample comprising contacting the sample with a compound that binds to and forms
a complex with the polypeptide under conditions and for a period sufficient to form the complex
and detecting the formation of the complex such that if a complex is formed, the polypeptide is
detected.

The invention also provides kits comprising polynucleotide probes and/or monoclonal
antibodies, and optionally quantitative standards, for carrying out methods of the invention.
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and
monitoring the progress of patients, involved in clinical trials for the treatment of disorders as
recited above.

The invention also provides methods for the identification of compounds that modulate
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides
of the invention. Such methods can be utilized, for example, for the identification of compounds
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are
not limited to, assays for identifying compounds and other substances that interact with (e.g.,
bind to) the polypeptides of the invention. The invention provides a method for identifying a
compound that binds to the polypeptides of the invention comprising contacting the compound
with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound
complex, wherein the complex drives expression of a reporter gene sequence in the cell; and
detecting the complex by detecting the reporter gene sequence expression such that if expression
of the reporter gene is detected, the compound that binds to a polypeptide of the invention is
identified.

The invention also provides methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other substances that
modulate the overall activity of the target gene products. Compounds and other substances can
effect such modulation either on the level of target gene/protein expression or target protein
activity.

The polypeptides of the present invention and the polynucleotides encoding them are also 
useful for the same functions known to one of skill in the art as the polypeptides and 
polynucleotides to which they have homology (set forth in Table 2); for which they have a 
signature region (as set forth in Table 3); or for which they have homology to a gene family (as 
set forth in Table 4). If no homology is set forth for a sequence, then the polypeptides and 
polynucleotides of the present invention are useful for a variety of applications, as described 
herein, including use in arrays for detection. 

4. DETAILED DESCRIPTION OF THE INVENTION 
4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular forms "a", 
"an" and "the" include plural references unless the context clearly dictates otherwise. 

The term "active" refers to those forms of the polypeptide which retain the biologic 
and/or immunologic activities of any naturally occurring polypeptide. According to the 
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide 
having structural, regulatory or biochemical functions of a naturally occurring molecule. 
Likewise "immunologically active" or "immunological activity" refers to the capability of the 
natural, recombinant or synthetic polypeptide to induce a specific immune response in 
appropriate animals or cells and to bind with specific antibodies. 

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or
enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of 
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the nucleic acids bind or it may be "complete" such that 
total complementarity exists between the single stranded molecules. The degree of 
complementarity between the nucleic acid strands has significant effects on the efficiency and 
strength of the hybridization between the nucleic acid strands. 
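
The base-pairing rule in this definition can be illustrated with a short Python sketch. The function names `reverse_complement` and `complementarity` are hypothetical and introduced only for illustration. The sketch assumes DNA strands over the letters A, C, G and T, each written 5' to 3', and it does not model the efficiency or strength of hybridization.

```python
# Illustrative sketch only; the names below are hypothetical, not part of the disclosure.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}  # Watson-Crick base pairs for DNA


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return "".join(PAIRS[base] for base in reversed(seq.upper()))


def complementarity(strand_a, strand_b):
    """Fraction of positions that base-pair when two equal-length strands,
    both written 5' to 3', are aligned antiparallel."""
    if len(strand_a) != len(strand_b):
        raise ValueError("strands must have equal length")
    # Antiparallel alignment: the first base of strand_a faces the last base of strand_b.
    paired = sum(
        PAIRS[a] == b
        for a, b in zip(strand_a.upper(), reversed(strand_b.upper()))
    )
    return paired / len(strand_a)
```

Under these assumptions, 5'-AGT-3' and its complement 3'-TCA-5' (written 5' to 3' as ACT) give a value of 1.0, corresponding to "complete" complementarity, while any value below 1.0 corresponds to "partial" complementarity.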

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many 
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line 
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages particularly
from the yolk sac, mesenteries, or gonadal ridges during embryogenesis that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that 
comprise the adult specialized organs, but are able to regenerate themselves. 

The term "expression modulating fragment," EMF, means a series of nucleotides which 
modulates the expression of an operably linked ORF or another EMF. 

As used herein, a sequence is said to "modulate the expression of an operably linked 
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs 
include, but are not limited to, promoters, and promoter modulating sequences (inducible 
elements). One class of EMFs are nucleic acid fragments which induce the expression of an 
operably linked ORF in response to a specific regulatory factor or physiological event. 

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic
origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T 
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences 
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this 
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or 
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic 
acid which is capable of being expressed in a recombinant transcriptional unit comprising 
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene. 
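
The substitution of U for T in RNA that is stated in this definition can be sketched in Python as follows. The helper name `to_rna` is hypothetical, and the sketch assumes input written with the letters A, C, G, T and N used in the Sequence Listing.

```python
def to_rna(dna):
    """Return the RNA form of a DNA sequence by replacing each T (thymine) with U (uracil).

    Hypothetical helper for illustration; other letters, including N, are left unchanged.
    """
    return dna.upper().replace("T", "U")
```

For instance, the DNA sequence AGT would be read as AGU where the polynucleotide is RNA.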

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides,
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides,
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ
ID NOs: 1-20.

Probes may, for example, be used to determine whether specific mRNA molecules are
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the
art. Probes of the present invention, their preparation and/or labeling are elaborated in
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John
Wiley & Sons, New York NY, both of which are incorporated herein by reference in their
entirety.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The
sequence information can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that
uniquely identifies or represents the sequence information of that sequence of SEQ ID NO:
1-1786 and 3573-5358. One such segment can be a twenty-mer nucleic acid sequence because the
probability that a twenty-mer is fully matched in the human genome is approximately 1 in 300.
In the human genome, there are three billion base pairs in one set of chromosomes. Because 4^20
possible twenty-mers exist, there are roughly 300 times more twenty-mers than there are base
pairs in a set of human chromosomes. Using the same analysis, the probability for a
seventeen-mer to be fully matched in the human genome is approximately 1 in 5. When these
segments are used in arrays for expression studies, fifteen-mer segments can be used. The
probability that the fifteen-mer is fully matched in the expressed sequences is also
approximately one in five because expressed sequences comprise less than approximately 5% of
the entire genome sequence.
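
The arithmetic behind these estimates can be reproduced with a minimal Python sketch. The constant and function names are hypothetical. The genome size (three billion base pairs) and the 5% expressed fraction are taken from the text, and the sketch assumes, as the text implicitly does, that every base is equally likely at every position.

```python
# Hypothetical names; the figures come from the paragraph above.
GENOME_BP = 3_000_000_000      # base pairs in one set of human chromosomes
EXPRESSED_FRACTION = 0.05      # upper bound on expressed sequence stated in the text


def kmers_per_site(k, target_bp):
    """Number of distinct k-mers (4**k) divided by the number of positions in the target.

    A result of R corresponds to a chance of roughly 1 in R that a given k-mer
    is fully matched somewhere in a random target of that size.
    """
    return 4 ** k / target_bp


if __name__ == "__main__":
    for label, k, size in [
        ("twenty-mer, whole genome", 20, GENOME_BP),
        ("seventeen-mer, whole genome", 17, GENOME_BP),
        ("fifteen-mer, expressed sequences", 15, GENOME_BP * EXPRESSED_FRACTION),
    ]:
        print(f"{label}: about 1 in {kmers_per_site(k, size):.1f}")
```

Under these assumptions the twenty-mer ratio falls in the hundreds and the seventeen-mer and fifteen-mer ratios fall in the single digits, which agrees in order of magnitude with the figures of 1 in 300 and 1 in 5 given above.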

Similarly, when using sequence information for detecting a single mismatch, a segment can
be a twenty-five mer. The probability that the twenty-five mer would appear in a human genome
with a single mismatch is calculated by multiplying the probability for a full match (1/4^25) times the
increased probability for mismatch at each nucleotide position (3 x 25). The probability that an
eighteen mer with a single mismatch can be detected in an array for expression studies is
approximately one in five. The probability that a twenty-mer with a single mismatch can be
detected in a human genome is approximately one in five.
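
The single-mismatch calculation described above can be written out as a short Python sketch. The function names are hypothetical. The sketch assumes equally likely bases and uses the genome size (three billion base pairs) and the 5% expressed fraction stated earlier in this section.

```python
def single_mismatch_probability(k):
    """Probability that a random k-base site matches a given k-mer with exactly one
    substitution: the full-match probability (1 / 4**k) times the 3 * k ways of
    changing one position to one of the three other bases."""
    return (3 * k) / 4 ** k


def expected_single_mismatch_sites(k, target_bp):
    """Expected number of single-mismatch sites in a random target of target_bp positions."""
    return single_mismatch_probability(k) * target_bp


if __name__ == "__main__":
    genome_bp = 3_000_000_000           # from the text
    expressed_bp = genome_bp * 0.05     # expressed sequences, at most about 5% of the genome
    print(expected_single_mismatch_sites(25, genome_bp))
    print(expected_single_mismatch_sites(20, genome_bp))
    print(expected_single_mismatch_sites(18, expressed_bp))
```

With these assumptions the twenty-mer and eighteen-mer cases both give values between one in ten and one in five, the same order of magnitude as the "approximately one in five" stated above, while the twenty-five mer value is several orders of magnitude smaller.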

The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein. 
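
As an illustration of this definition, the following Python sketch scans one reading frame for the longest run of triplets that contains no termination codon. The function name `longest_orf` is hypothetical. The sketch assumes the three stop codons of the standard genetic code (TAA, TAG, TGA) and, following the definition as worded, does not require an initiating ATG.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # termination codons of the standard genetic code


def longest_orf(seq, frame=0):
    """Return the longest stretch of whole codons, read in the given frame (0, 1 or 2),
    that contains no termination codon. Hypothetical helper for illustration."""
    seq = seq.upper()
    best, current = "", ""
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            current = ""            # a stop codon ends the current run
        else:
            current += codon
            if len(current) > len(best):
                best = current
    return best
```

Calling the function once for each of the three frames of a strand, and again for the reverse complement, would cover all six reading frames of a double-stranded sequence.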

The terms "operably linked" or "operably associated" refer to functionally related nucleic 
acid sequences. For example, a promoter is operably associated or operably linked with a coding 
sequence if the promoter controls the transcription of the coding sequence. While operably 
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic 
elements e.g. repressor genes are not contiguously linked to the coding sequence but still control 
transcription/translation of the coding sequence. 

The term "pluripotent" refers to the capability of a cell to differentiate into a number of 
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its
differentiation capability in comparison to a totipotent cell. 

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide, 
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or 
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino 
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more 
preferably at least about 9 amino acids and most preferably at least about 17 or more amino 
acids. The peptide preferably is not greater than about 200 amino acids, more preferably less 
than 150 amino acids and most preferably less than 100 amino acids. Preferably the peptide is
from about 5 to about 200 amino acids. To be active, any polypeptide must have sufficient 
length to display biological and/or immunological activity. 

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that 
have not been genetically engineered and specifically contemplates various polypeptides arising 
from post-translational modifications of the polypeptide including, but not limited to, acetylation, 
carboxylation, glycosylation, phosphorylation, lipidation and acylation. 

The term "translated protein coding portion" means a sequence which encodes for the full 
length protein which may include any leader sequence or any processing sequence. 

The term "mature protein coding sequence" means a sequence which encodes a peptide 
or protein without a signal or leader sequence. The "mature protein portion" means that portion 
of the protein which does not include a signal or leader sequence. The peptide may have been 
produced by processing in the cell which removes any leader/signal sequence. The mature 
protein portion may or may not include the initial methionine residue. The methionine residue 
may be removed from the protein during processing in the cell. The peptide may be produced 
synthetically or the protein may have been produced using a polynucleotide encoding only
the mature protein coding sequence.

The term "derivative" refers to polypeptides chemically modified by such techniques as
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer 
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or 
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur 
in human proteins. 

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be 
replaced, added or deleted without abolishing activities of interest, may be found by comparing 
the sequence of the particular polypeptide with that of homologous peptides and minimizing the 
number of amino acid sequence changes made in regions of high homology (conserved regions) 
or by replacing amino acids with consensus sequence. 

Alternatively, recombinant variants encoding these same or similar polypeptides may be 
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon 
substitutions, such as the silent changes which produce various restriction sites, may be 
introduced to optimize cloning into a plasmid or viral vector or expression in a particular 
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in 
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of 
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain 
affinities, or degradation/turnover rate. 

Preferably, amino acid "substitutions" are the result of replacing one amino acid with 
another amino acid having similar structural and/or chemical properties, i.e., conservative amino
acid replacements. "Conservative" amino acid substitutions may be made on the basis of 
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic 
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include 
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar 
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and 
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and 
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or 
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10 
amino acids. The variation allowed may be experimentally determined by systematically making 
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using 
recombinant DNA techniques and assaying the resulting recombinant variants for activity. 
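
The four residue classes listed above can be encoded in a short Python sketch that tests whether a substitution stays within one class. The names `CLASSES`, `residue_class` and `is_conservative` are hypothetical, residues are given in the standard one-letter code, and treating "same class" as the sole test of a conservative replacement is a simplifying assumption made for illustration.

```python
# Residue classes as listed in the text, in one-letter amino acid code.
CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}


def residue_class(aa):
    """Return the name of the class containing the given one-letter residue."""
    for name, members in CLASSES.items():
        if aa.upper() in members:
            return name
    raise ValueError(f"unknown amino acid: {aa!r}")


def is_conservative(original, replacement):
    """True if both residues fall in the same class (simplified, hypothetical criterion)."""
    return residue_class(original) == residue_class(replacement)
```

Under this simplified criterion, replacing leucine with isoleucine or aspartic acid with glutamic acid counts as conservative, whereas replacing lysine with glutamic acid does not.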

Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide 
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover 
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited 
for expression, scale up and the like in the host cells chosen for expression. For example, 
cysteine residues can be deleted or substituted with another amino acid residue in order to 
eliminate disulfide bridges. 

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological 
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the 
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more 
preferably at least 99% by weight, of the indicated biological macromolecules present (but water, 
buffers, and other small molecules, especially molecules having a molecular weight of less than 
1000 daltons, can be present). 

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from 
at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or 
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in 
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a 
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or 
polypeptides present in their natural source. 

The term "recombinant," when used herein to refer to a polypeptide or protein, means 
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian)
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in 
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial" 
defines a polypeptide or protein essentially free of native endogenous substances and 
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most 
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or
proteins expressed in yeast will have a glycosylation pattern in general different from those 
expressed in mammalian cells. 

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or virus 
or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can 
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements 
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural 
or coding sequence which is transcribed into mRNA and translated into protein, and (3) 
appropriate transcription initiation and termination sequences. Structural units intended for use
in yeast or eukaryotic expression systems preferably include a leader sequence enabling
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant 
protein is expressed without a leader or transport sequence, it may include an amino terminal 
methionine residue. This residue may or may not be subsequently cleaved from the expressed 
recombinant protein to provide a final product.

The term "recombinant expression system" means host cells which have stably integrated 
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant 
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will 
express heterologous polypeptides or proteins upon induction of the regulatory elements linked 
to the DNA segment or synthetic gene to be expressed. This term also means host cells which
have stably integrated a recombinant genetic element or elements having a regulatory role in 
gene expression, for example, promoters or enhancers. Recombinant expression systems as 
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the 
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells 
can be prokaryotic or eukaryotic.

The term "secreted" includes a protein that is transported across or through a membrane, 
including transport as a result of signal sequences in its amino acid sequence when it is expressed 
in a suitable host cell. "Secreted" proteins include without limitation proteins secreted wholly 
(e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are expressed.
"Secreted" proteins also include without limitation proteins that are transported across the
membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to include
proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney, P.A. and
Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells (e.g.,
Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol.
16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader 
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence 
may be naturally present on the polypeptides of the present invention or provided from 
heterologous protein sources by recombinant DNA techniques. 

The term "stringent" is used to refer to conditions that are commonly understood in the
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e.,
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are
described herein in the examples.

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for
14-base oligonucleotides), 48°C (for 17-base oligonucleotides), 55°C (for 20-base oligonucleotides), and
60°C (for 23-base oligonucleotides). 
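
By way of illustration only, the exemplary oligonucleotide wash temperatures recited above can be captured in a small lookup table. The following Python sketch is not part of the disclosed methods; the names `OLIGO_WASH_TEMP_C` and `exemplary_wash_temp` are hypothetical, and no interpolation between the recited lengths is attempted because none is described.

```python
# Exemplary wash temperatures (degrees C) in 6X SSC/0.05% sodium pyrophosphate,
# keyed by oligonucleotide length in bases, as recited in the paragraph above.
# The names below are hypothetical and used for illustration only.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def exemplary_wash_temp(oligo_length):
    """Return the recited wash temperature for an oligo length, or None.

    Only the four recited lengths are covered. Intermediate lengths return
    None because the text does not state how they should be handled.
    """
    return OLIGO_WASH_TEMP_C.get(oligo_length)


if __name__ == "__main__":
    for length in (14, 17, 20, 23, 25):
        print(length, exemplary_wash_temp(length))
```

A lookup that returns `None` for unlisted lengths was chosen over a fitted formula so that the sketch does not imply conditions beyond those recited.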

As used herein, "substantially equivalent" can refer both to nucleotide and amino acid 
sequences, for example a mutant sequence, that varies from a reference sequence by one or more 
substitutions, deletions, or additions, the net effect of which does not result in an adverse 
functional dissimilarity between the reference and subject sequences. Typically, such a 
substantially equivalent sequence varies from one of those listed herein by no more than about 
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a 
substantially equivalent sequence, as compared to the corresponding reference sequence, divided 
by the total number of residues in the substantially equivalent sequence is about 0.35 or less). 
Such a sequence is said to have 65% sequence identity to the listed sequence. In one 
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a 
listed sequence by no more than 30% (70% sequence identity); in a variation of this embodiment,
by no more than 25% (75% sequence identity); in a further variation of this embodiment, by
no more than 20% (80% sequence identity); in a further variation of this embodiment, by no
more than 10% (90% sequence identity); and in a further variation of this embodiment, by no
more than 5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid
sequences according to the invention preferably have at least 80% sequence identity with a listed 
amino acid sequence, more preferably at least 90% sequence identity. Substantially equivalent 
nucleotide sequences of the invention can have lower percent sequence identities, taking into 
account, for example, the redundancy or degeneracy of the genetic code. Preferably, the nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity, and most
preferably at least about 95% identity. For the purposes of the present invention, sequences 
having substantially equivalent biological activity and substantially equivalent expression 
characteristics are considered substantially equivalent. For the purposes of determining 
equivalence, truncation of the mature sequence (e.g., via a mutation which creates a spurious 
stop codon) should be disregarded. Sequence identity may be determined, e.g., using the Jotun 
Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). Identity between sequences can 
also be determined by other methods known in the art, e.g. by varying hybridization conditions. 
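
As a non-limiting illustration, the numerical part of the "substantially equivalent" definition (the number of substitutions, additions, and deletions divided by the number of residues in the substantially equivalent sequence) can be sketched in Python as follows. The sketch assumes the two sequences have already been aligned by some other method, with "-" marking gaps; the function names and the gap convention are assumptions made for illustration, and the functional criteria of the definition are not evaluated.

```python
def fraction_varied(aligned_reference, aligned_subject, gap="-"):
    """Fraction of the subject sequence that varies from the reference.

    Both inputs must be pre-aligned strings of equal length. Each aligned
    column that differs counts as one change: a substitution, an addition
    (gap in the reference), or a deletion (gap in the subject). The count is
    divided by the number of residues in the subject sequence.
    """
    if len(aligned_reference) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    changes = 0
    subject_residues = 0
    for ref, sub in zip(aligned_reference.upper(), aligned_subject.upper()):
        if sub != gap:
            subject_residues += 1
        if ref == gap and sub == gap:
            continue  # a column of two gaps carries no information
        if ref != sub:
            changes += 1
    if subject_residues == 0:
        raise ValueError("subject sequence has no residues")
    return changes / subject_residues


def meets_identity_threshold(aligned_reference, aligned_subject, max_fraction=0.35):
    """True if the subject varies by no more than max_fraction (default 35%)."""
    return fraction_varied(aligned_reference, aligned_subject) <= max_fraction


if __name__ == "__main__":
    ref = "ACGTACGTAC"
    sub = "ACGTTCGTAC"  # one substitution in ten residues
    print(fraction_varied(ref, sub), meets_identity_threshold(ref, sub))
```

Passing a smaller `max_fraction` (for example 0.20 or 0.05) corresponds to the narrower embodiments recited above.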

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell 
types of an adult organism. 

The term "transformation" means introducing DNA into a suitable host cell so that the
DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction 
of nucleic acids into a suitable host cell by use of a virus or viral vector. 

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified 
using known UMFs as a target sequence or target motif with the computer-based systems 
described below. The presence and activity of a UMF can be confirmed by attaching the 
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated 
with an appropriate host under appropriate conditions and the uptake of the marker sequence is 
determined. As described above, a UMF will increase the frequency of uptake of a linked 
marker sequence. 

Each of the above terms is meant to encompass all that is described for each, unless the 
context dictates otherwise. 



4.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence Listing. 
The isolated polynucleotides of the invention include a polynucleotide comprising the 
nucleotide sequences of SEQ ID NO: 1-1786 and 3573-5358; a polynucleotide encoding any one
of the peptide sequences of SEQ ID NO: 1787-3572 and 5359-7144; and a polynucleotide 
comprising the nucleotide sequence encoding the mature protein coding sequence of the 
polypeptides of any one of SEQ ID NO:1787-3572 and 5359-7144. The polynucleotides of the 
present invention also include, but are not limited to, a polynucleotide that hybridizes under 
stringent conditions to (a) the complement of any of the nucleotide sequences of SEQ ID NO: 1-1786
and 3573-5358; (b) nucleotide sequences encoding any one of the amino acid sequences
set forth in the Sequence Listing; (c) a polynucleotide which is an allelic variant of any 
polynucleotide recited above; (d) a polynucleotide which encodes a species homolog of any of 
the proteins recited above; or (e) a polynucleotide that encodes a polypeptide comprising a 
specific domain or truncation of the polypeptides of SEQ ID NO: 1787-3572 and 5359-7144. 
Domains of interest may depend on the nature of the encoded polypeptide; e.g., domains in 
receptor-like polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic 
domains, or combinations thereof; domains in immunoglobulin-like proteins include the variable 
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and
substrate binding domains; and domains in ligand polypeptides include receptor-binding 
domains. 

The polynucleotides of the invention include naturally occurring or wholly or partially
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides 
may include all of the coding region of the cDNA or may represent a portion of the coding 
region of the cDNA. 

The present invention also provides genes corresponding to the cDNA sequences disclosed
herein. The corresponding genes can be isolated in accordance with known methods using the
sequence information disclosed herein. Such methods include the preparation of probes or primers
from the disclosed sequence information for identification and/or amplification of genes in
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that
corresponds to any of the polynucleotides of SEQ ID NO: 1-1786 and 3573-5358 can be obtained
by screening appropriate cDNA or genomic DNA libraries under suitable hybridization conditions
using any of the polynucleotides of SEQ ID NO: 1-1786 and 3573-5358 or a portion thereof as a
probe. Alternatively, the polynucleotides of SEQ ID NO: 1-1786 and 3573-5358 may be used as the
basis for suitable primer(s) that allow identification and/or amplification of genes in appropriate
genomic DNA or cDNA libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and sequences 
(including cDNA and genomic sequences) obtained from one or more public databases, such as 
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information,
representative fragment or segment information, or novel segment information for the full-length
gene. 

The polynucleotides of the invention also provide polynucleotides including nucleotide 
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides 
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about
75%, at least about 80%, more typically at least about 90%, and even more typically at least
about 95%, sequence identity to a polynucleotide recited above. 

Included within the scope of the nucleic acid sequences of the invention are nucleic acid 
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences 
of SEQ ID NO: 1-1786 and 3573-5358, or complements thereof, which fragment is greater than
about 5 nucleotides, preferably 7 nucleotides, more preferably greater than 9 nucleotides and
most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or 20 nucleotides or more
that are selective for (i.e., specifically hybridize to) any one of the polynucleotides of the
invention are contemplated. Probes capable of specifically hybridizing to a polynucleotide can
differentiate polynucleotide sequences of the invention from other polynucleotide sequences in
the same family of genes or can differentiate human genes from genes of other species, and are
preferably based on unique nucleotide sequences. 

The sequences falling within the scope of the present invention are not limited to these 
specific sequences, but also include allelic and species variations thereof. Allelic and species 
variations can be routinely determined by comparing the sequence provided in SEQ ID NO: 1-1786
and 3573-5358, a representative fragment thereof, or a nucleotide sequence at least 90% identical,
preferably 95% identical, to SEQ ID NO: 1-1786 and 3573-5358 with a sequence from another
isolate of the same species. Furthermore, to accommodate codon variability, the invention includes
nucleic acid molecules coding for the same amino acid sequences as do the specific ORFs disclosed
herein. In other words, in the coding region of an ORF, substitution of one codon for another codon
that encodes the same amino acid is expressly contemplated. 
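
To illustrate the codon substitution contemplated above, the following Python sketch translates two ORFs with the standard genetic code and checks whether they encode the same amino acid sequence. It is a minimal example under stated assumptions: the standard code is used (alternative codes are not considered), the ORFs are given as DNA in frame, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order. "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf):
    """Translate an in-frame DNA ORF into a one-letter amino acid string."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def encode_same_protein(orf_a, orf_b):
    """True if both ORFs translate to the identical amino acid sequence."""
    return translate(orf_a) == translate(orf_b)


if __name__ == "__main__":
    original = "ATGGCTTTATAA"       # Met-Ala-Leu-stop
    codon_variant = "ATGGCCCTGTGA"  # different codons, same residues
    print(translate(original), encode_same_protein(original, codon_variant))
```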

The nearest neighbor or homology result for the nucleic acids of the present invention, 
including SEQ ID NO: 1-1786 and 3573-5358, can be obtained by searching a database using an
algorithm or a program. Preferably, BLAST, which stands for Basic Local Alignment Search Tool,
is used to search for local sequence alignments (Altschul, S.F. J. Mol. Evol. 36:290-300 (1993) and
Altschul, S.F. et al. J. Mol. Biol. 215:403-410 (1990)). Alternatively, a FASTA version 3 search
against Genpept, using the Fastxy algorithm, can be used.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also 
provided by the present invention. Species homologs may be isolated and identified by making
suitable probes or primers from the sequences provided herein and screening a suitable nucleic
acid source from the desired species. 

The invention also encompasses allelic variants of the disclosed polynucleotides or 
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also 
encode proteins which are identical, homologous or related to that encoded by the
polynucleotides.

The nucleic acid sequences of the invention are further directed to sequences which
encode variants of the described nucleic acids. These amino acid sequence variants may be
prepared by methods known in the art by introducing appropriate nucleotide changes into a
native or variant polynucleotide. There are two variables in the construction of amino acid
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids
encoding the amino acid sequence variants are preferably constructed by mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic
acid alterations can be made at sites that differ in the nucleic acids from different species
(variable positions) or in highly conserved regions (constant regions). Sites at such locations
will typically be modified in series, e.g., by substituting first with conservative choices (e.g.,
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions 
may be made at the target site. Amino acid sequence deletions generally range from about 1 to 
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid 
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid
residues. Intrasequence insertions may range generally from about 1 to 10 amino acid residues,
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal
sequences necessary for secretion or for intracellular targeting in different host cells and
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein.

In a preferred method, polynucleotides encoding the novel amino acid sequences are 
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a 
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the
site being changed. In general, the techniques of site-directed mutagenesis are well known to
those of skill in the art and this technique is exemplified by publications such as Edelman et al.,
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids.
When small amounts of template DNA are used as starting material, primer(s) that differ
slightly in sequence from the corresponding region in the template DNA can generate the desired
amino acid variant. PCR amplification results in a population of product DNA fragments that
differ from the polynucleotide template encoding the polypeptide at the position specified by the
primer. The product DNA fragments replace the corresponding region in the plasmid and this
gives a polynucleotide encoding the desired amino acid variant.
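
The structure of a mutagenic oligonucleotide of the kind described above (an altered codon flanked on both sides by unchanged template sequence) can be sketched as follows. This Python example is illustrative only: the function name, the default flank length of 12 nucleotides, and the example template are assumptions, and the flank length actually needed to form a stable duplex depends on conditions not addressed here.

```python
def design_mutagenic_oligo(template, codon_index, new_codon, flank=12):
    """Build a sense-strand oligo carrying one codon change.

    template    -- in-frame coding sequence (DNA), 5' to 3'
    codon_index -- zero-based index of the codon to replace
    new_codon   -- three-nucleotide replacement codon
    flank       -- unchanged template nucleotides kept on each side
                   (hypothetical default; choose to suit the duplex needed)
    """
    template = template.upper()
    new_codon = new_codon.upper()
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three nucleotides")
    start = codon_index * 3
    end = start + 3
    if start - flank < 0 or end + flank > len(template):
        raise ValueError("not enough template sequence for the requested flank")
    upstream = template[start - flank:start]
    downstream = template[end:end + flank]
    return upstream + new_codon + downstream


if __name__ == "__main__":
    # Hypothetical template, used only to exercise the function.
    template = "ATGGCTAGCAAAGGTGAAGAACTGTTTACCGGTGTTGTG"
    # Replace codon 6 (GAA, Glu) with AAA (Lys), keeping 9 nt on each side.
    print(design_mutagenic_oligo(template, 6, "AAA", flank=9))
```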

A further technique for generating amino acid variants is the cassette mutagenesis 
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic
code, other DNA sequences which encode substantially the same or a functionally equivalent
amino acid sequence may be used in the practice of the invention for the cloning and expression
of these novel nucleic acids. Such DNA sequences include those which are capable of
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions.

Polynucleotides encoding preferred polypeptide truncations of the invention can be used
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more 
domains of the invention and heterologous protein sequences. 

The polynucleotides of the invention additionally include the complement of any of the 
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or 
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known 
to those of skill in the art and can include, for example, methods for determining hybridization 
conditions that can routinely isolate polynucleotides of the desired sequence identities. 

In accordance with the invention, polynucleotide sequences comprising the mature 
protein coding sequences corresponding to any one of SEQ ID NO: 1-1786 and 3573-5358, or
functional equivalents thereof, may be used to generate recombinant DNA molecules that direct 
the expression of that nucleic acid, or a functional equivalent thereof, in appropriate host cells. 
Also included are the cDNA inserts of any of the clones identified herein. 

A polynucleotide according to the invention can be joined to any of a variety of other
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al. 
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful 
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g., 
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the 
art. Accordingly, the invention also provides a vector including a polynucleotide of the 
invention and a host cell containing the polynucleotide. In general, the vector contains an origin 
of replication functional in at least one organism, convenient restriction endonuclease sites, and a 
selectable marker for the host cell. Vectors according to the invention include expression 
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell 
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular 
organism or part of a multicellular organism. 

The present invention further provides recombinant constructs comprising a nucleic acid 
having any of the nucleotide sequences of SEQ ID NO: 1-1786 and 3573-5358 or a fragment
thereof or any other polynucleotides of the invention. In one embodiment, the recombinant 
constructs of the present invention comprise a vector, such as a plasmid or viral vector, into 
which a nucleic acid having any of the nucleotide sequences of SEQ ID NO: 1-1786 and 3573-5358
or a fragment thereof is inserted, in a forward or reverse orientation. In the case of a vector
comprising one of the ORFs of the present invention, the vector may further comprise regulatory 
sequences, including for example, a promoter, operably linked to the ORF. Large numbers of 
suitable vectors and promoters are known to those of skill in the art and are commercially 
available for generating the recombinant constructs of the present invention. The following 
vectors are provided by way of example. Bacterial: pBs, phagescript, PsiX174, pBluescript SK,
pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3,
pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44, PXTI, pSG (Stratagene);
pSVK3, pBPV, pMSG, pSVL (Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an expression
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many
suitable expression control sequences are known in the art. General methods of expressing
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in
Enzymology 185, 537-566 (1990). As defined herein "operably linked" means that the isolated
polynucleotide of the invention and an expression control sequence are situated within a vector 
or cell in such a way that the protein is expressed by a host cell which has been transformed 
(transfected) with the ligated polynucleotide/expression control sequence. 

Promoter regions can be selected from any desired gene using CAT (chloramphenicol
acetyltransferase) vectors or other vectors with selectable markers. Two appropriate vectors are
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt,
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine 
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of 
the appropriate vector and promoter is well within the level of ordinary skill in the art. 

Generally, recombinant expression vectors will include origins of replication and selectable
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct
transcription of a downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is
assembled in appropriate phase with translation initiation and termination sequences, and
preferably, a leader sequence capable of directing secretion of translated protein into the
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an amino terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting a structural DNA
sequence encoding a desired protein together with suitable translation initiation and termination
signals in operable reading phase with a functional promoter. The vector will comprise one or
more phenotypic selectable markers and an origin of replication to ensure maintenance of the
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be 
employed as a matter of choice. 

As a representative but non-limiting example, useful expression vectors for bacterial use 
can comprise a selectable marker and bacterial origin of replication derived from commercially
available plasmids comprising genetic elements of the well known cloning vector pBR322
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine
Chemicals, Uppsala, Sweden) and GEM1 (Promega Biotech, Madison, WI, USA). These
pBR322 "backbone" sections are combined with an appropriate promoter and the structural
sequence to be expressed. Following transformation of a suitable host strain and growth of the
host strain to an appropriate cell density, the selected promoter is induced or derepressed by
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an
additional period. Cells are typically harvested by centrifugation, disrupted by physical or 
chemical means, and the resulting crude extract retained for further purification. 

Polynucleotides of the invention can also be used to induce immune responses. For
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or
following injection, and preferably intramuscular injection of the DNA. The nucleic acid
sequences are preferably inserted in a recombinant expression vector and may be in the form of
naked DNA. 



4.3 ANTISENSE 

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that 
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide
sequence of SEQ ID NO: 1-1786 and 3573-5358, or fragments, analogs or derivatives thereof.
An "antisense" nucleic acid comprises a nucleotide sequence that is complementary to a "sense"
nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded
cDNA molecule or complementary to an mRNA sequence. In specific aspects, antisense nucleic
acid molecules are provided that comprise a sequence complementary to at least about 10, 25,
50, 100, 250 or 500 nucleotides or an entire coding strand, or to only a portion thereof. Nucleic
acid molecules encoding fragments, homologs, derivatives and analogs of a protein of any of
SEQ ID NO: 1787-3572 and 5359-7144 or antisense nucleic acids complementary to a nucleic
acid sequence of SEQ ID NO: 1-1786 and 3573-5358 are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region"
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers 
to the region of the nucleotide sequence comprising codons which are translated into amino acid 
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ ID
NO: 1-1786 and 3573-5358), antisense nucleic acids of the invention can be designed according
to the rules of Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid molecule
can be complementary to the entire coding region of a mRNA, but more preferably is an 
oligonucleotide that is antisense to only a portion of the coding or noncoding region of a mRNA. 
For example, the antisense oligonucleotide can be complementary to the region surrounding the 
translation start site of a mRNA. An antisense oligonucleotide can be, for example, about 5, 10, 
15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention 
can be constructed using chemical synthesis or enzymatic ligation reactions using procedures 
known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can 
be chemically synthesized using naturally occurring nucleotides or variously modified 
nucleotides designed to increase the biological stability of the molecules or to increase the 
physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., 
phosphorothioate derivatives and acridine substituted nucleotides can be used. 
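
For illustration, designing an antisense oligonucleotide by Watson and Crick base pairing amounts to taking the reverse complement of the targeted portion of the coding strand. The Python sketch below shows this for a DNA coding strand; the function names and the example sequence are hypothetical, Hoogsteen pairing and modified nucleotides are not modeled, and the choice of target region is left to the user.

```python
# Watson-Crick complements for DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def antisense_oligo(coding_strand, start, length):
    """Return a DNA oligo antisense to coding_strand[start:start + length].

    start is a zero-based offset into the coding strand. The result is
    written 5' to 3' and pairs with the selected region of the sense strand
    (or the corresponding mRNA).
    """
    if start < 0 or length <= 0 or start + length > len(coding_strand):
        raise ValueError("target region lies outside the coding strand")
    return reverse_complement(coding_strand[start:start + length])


if __name__ == "__main__":
    # Hypothetical coding strand; the ATG start codon begins at offset 6.
    coding = "GCCACCATGGCTAGCAAAGGT"
    # A 10-nucleotide oligo spanning the region around the start codon.
    print(antisense_oligo(coding, 3, 10))
```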

Examples of modified nucleotides that can be used to generate the antisense nucleic acid 
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine,
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into which a
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest,
described further in the following subsection). 

The antisense nucleic acid molecules of the invention are typically administered to a 
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or 
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by 
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of 
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in 
the major groove of the double helix. An example of a route of administration of antisense
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively,
antisense nucleic acid molecules can be modified to target selected cells and then administered
systemically. For example, for systemic administration, antisense molecules can be modified
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g.,
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using
the vectors described herein. To achieve sufficient intracellular concentrations of antisense 
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the 
control of a strong pol II or pol III promoter are preferred. 

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The
antisense nucleic acid molecule can also comprise a 2'-o-methylribonucleotide (Inoue et al.
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987)
FEBS Lett 215: 327-330).



4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a
single-stranded nucleic acid, such as an mRNA, to which they have a complementary region.
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haseloff and Gerlach (1988)
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit
translation of an mRNA. A ribozyme having specificity for a nucleic acid of the invention can be
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO: 1-1786
and 3573-5358). For example, a derivative of a Tetrahymena L-19 IVS RNA can be
constructed in which the nucleotide sequence of the active site is complementary to the
nucleotide sequence to be cleaved in a SECX-encoding mRNA. See, e.g., Cech et al. U.S. Pat.
No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, SECX mRNA can be
used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA
molecules. See, e.g., Bartel et al. (1993) Science 261:1411-1418.

Alternatively, gene expression can be inhibited by targeting nucleotide sequences 
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical 
structures that prevent transcription of the gene in target cells. See generally, Helene (1991)
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and
Maher (1992) Bioessays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the base 
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or 
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic 
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a 
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral 
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under 
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using 
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above;
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For 
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of 
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication.
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in
combination with other enzymes, e.g., S1 nucleases (Hyrup (1996) above); or as probes or
primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996),
above).

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their 
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the 
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug 
delivery known in the art. For example, PNA-DNA chimeras can be generated that may 
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition 
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA 
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras
can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24:
3357-63. For example, a DNA chain can be synthesized on a solid support using standard
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3'
DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem
Lett 5: 1119-11124.

In other embodiments, the oligonucleotide may include other appended groups such as 
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556;
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition,
oligonucleotides can be modified with hybridization triggered cleavage agents (see, e.g., Krol et
al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988, Pharm. Res.
5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a
peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered
cleavage agent, etc.

4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain the
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the
invention introduced into the host cell using known transformation, transfection or infection
methods. The present invention still further provides host cells genetically engineered to express
the polynucleotides of the invention, wherein such polynucleotides are in operative association
with a regulatory sequence heterologous to the host cell which drives expression of the
polynucleotides in the cell.

Knowledge of nucleic acid sequences allows for modification of cells to permit, or 
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous 
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the
naturally occurring promoter with all or part of a heterologous promoter so that the cells express
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it
is operatively linked to the encoding sequences. See, for example, PCT International Publication
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding
sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.
The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis,
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the
polynucleotides of the invention can be used in conventional manners to produce the gene
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a
heterologous protein under the control of the EMF.

Any host/vector system can be used to express one or more of the ORFs of the present 
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells,
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis.
The most preferred cells are those which do not normally express the particular polypeptide or
protein or which express the polypeptide or protein at a low natural level. Mature proteins can
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate
promoters. Cell-free translation systems can also be employed to produce such proteins using
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New 
York (1989), the disclosure of which is hereby incorporated by reference. 

Various mammalian cell culture systems can also be employed to express recombinant 
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney 
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a 
compatible vector are, for example, the C127, monkey COS cells, Chinese Hamster Ovary
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK,
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example,
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced 
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or 
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein 
refolding steps can be used, as necessary, in completing configuration of the mature protein. 
Finally, high performance liquid chromatography (HPLC) can be employed for final purification
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient 
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing 
agents. 

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it
may be necessary to modify the protein produced therein, for example by phosphorylation or
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent
attachments may be accomplished using known chemical or enzymatic methods.

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene
may be replaced by homologous recombination. As described herein, gene targeting can be used
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions,
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or
combinations of said sequences. Alternatively, sequences which affect the structure or stability
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice
sites, leader sequences for enhancing or modifying transport or secretion properties of the
protein, or other sequences which alter or improve the function or stability of protein or RNA
molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the 
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or 
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion 
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element. 
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific 
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than 
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new 
sequences are added. In all cases, the identification of the targeting event may be facilitated by 
the use of one or more selectable marker genes that are contiguous with the targeting DNA, 
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell 
genome. The identification of the targeting event may also be facilitated by the use of one or 
more marker genes exhibiting the property of negative selection, such that the negatively 
selectable marker is linked to the exogenous DNA, but configured such that the negatively 
selectable marker flanks the targeting sequence, and such that a correct homologous 
recombination event with sequences in the host cell genome does not result in the stable 
integration of the negatively selectable marker. Markers useful for this purpose include the 
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine 
phosphoribosyl-transferase (gpt) gene. 

The gene targeting or gene activation techniques which can be used in accordance with 
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to 
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference
herein in its entirety. 

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide 
comprising: the amino acid sequences set forth as any one of SEQ ID NO: 1787-3572 and
5359-7144 or an amino acid sequence encoded by any one of the nucleotide sequences SEQ ID
NO: 1-1786 and 3573-5358 or the corresponding full length or mature protein. Polypeptides of the
invention also include polypeptides preferably with biological or immunological activity that are
encoded by: (a) a polynucleotide having any one of the nucleotide sequences set forth in SEQ ID
NO: 1-1786 and 3573-5358 or (b) polynucleotides encoding any one of the amino acid sequences
set forth as SEQ ID NO: 1787-3572 and 5359-7144 or (c) polynucleotides that hybridize to the
complement of the polynucleotides of either (a) or (b) under stringent hybridization conditions.
The invention also provides biologically active or immunologically active variants of any of the
amino acid sequences set forth as SEQ ID NO: 1787-3572 and 5359-7144 or the corresponding
full length or mature protein; and "substantial equivalents" thereof (e.g., with at least about
65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least
about 90%, typically at least about 95%, more typically at least about 98%, or most typically at
least about 99% amino acid identity) that retain biological activity. Polypeptides encoded by
allelic variants may have a similar, increased, or decreased activity compared to polypeptides
comprising SEQ ID NO: 1787-3572 and 5359-7144.
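
As a non-limiting illustration of the identity thresholds recited above, percent amino acid identity between two already-aligned sequences can be computed as sketched below. The function names and example peptides are hypothetical. The alignment itself is assumed to have been produced elsewhere, and the choice of denominator (here, every aligned column that is not a gap in both sequences) is an assumption; alignment programs may count columns differently.

```python
# Minimal sketch: percent identity over a pre-computed pairwise alignment.
# Gaps are written as "-"; all names here are illustrative assumptions.
THRESHOLDS = (65, 70, 75, 80, 85, 90, 95, 98, 99)  # percentages recited in the text


def percent_identity(aligned_a, aligned_b):
    """Return percent identity between two equal-length aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # column carries no residue in either sequence
        compared += 1
        if a == b:
            matches += 1  # identical residues (a gap never equals a residue)
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def highest_threshold_met(identity):
    """Return the largest recited threshold that `identity` reaches, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


# Hypothetical 10-residue peptides differing at one position.
identity = percent_identity("MKTAYIAKQR", "MKTAYIAKQK")
print(identity, highest_threshold_met(identity))
```

Identity alone does not establish a substantial equivalent in the sense used above; retention of biological activity must be assessed separately.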

Fragments of the proteins of the present invention which are capable of exhibiting 
biological activity are also encompassed by the present invention. Fragments of the protein may 
be in linear form or they may be cyclized using known methods, for example, as described in H.
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer.
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such
fragments may be fused to carrier molecules such as immunoglobulins for many purposes, 
including increasing the valency of protein binding sites. 

The present invention also provides both full-length and mature forms (for example, 
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding
sequence is identified in the sequence listing by translation of the disclosed nucleotide
sequences. The mature form of such protein may be obtained by expression of a full-length
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form
of the protein is also determinable from the amino acid sequence of the full-length form. Where
proteins of the present invention are membrane bound, soluble forms of the proteins are also
provided. In such forms, part or all of the regions causing the proteins to be membrane bound
are deleted so that the proteins are fully secreted from the cell in which they are expressed.

Protein compositions of the present invention may further comprise an acceptable carrier, 
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the nucleic acid
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a 
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to 
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic 
acid fragments of the present invention are the ORFs that encode proteins. 
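
The definition of a degenerate variant given above can be illustrated with the following sketch, which translates two nucleotide sequences using the standard genetic code and checks that they differ in nucleotide sequence yet encode an identical polypeptide. The function names and the two short ORFs are hypothetical and are not taken from the sequence listing.

```python
# Minimal sketch: test whether two ORFs are degenerate variants of each other.
# The codon table is the standard genetic code in TCAG order.
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA ORF in frame 0, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def is_degenerate_variant(dna_a, dna_b):
    """True if the sequences differ but encode the same polypeptide."""
    return dna_a.upper() != dna_b.upper() and translate(dna_a) == translate(dna_b)


# Hypothetical ORFs that both encode the peptide MASKL.
orf = "ATGGCTTCCAAGCTGTAA"
variant = "ATGGCCAGCAAACTTTGA"
print(translate(orf), is_degenerate_variant(orf, variant))
```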
A variety of methodologies known in the art can be utilized to obtain any one of the
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid
sequence can be synthesized using commercially available peptide synthesizers. The
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary
structural and/or conformational characteristics with proteins may possess biological properties
in common therewith, including protein activity. This technique is particularly useful in
producing small peptides and fragments of larger polypeptides. Fragments are useful, for
example, in generating antibodies against the native polypeptide. Thus, they may be employed
as biologically active or immunological substitutes for natural, purified proteins in screening of
therapeutic compounds and in immunological processes for the development of antibodies.

The polypeptides and proteins of the present invention can alternatively be purified from 
cells which have been altered to express the desired polypeptide or protein. As used herein, a 
cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic 
manipulation, is made to produce a polypeptide or protein which it normally does not produce or
which the cell normally produces at a lower level. One skilled in the art can readily adapt
procedures for introducing and expressing either recombinant or synthetic sequences into 
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides 
or proteins of the present invention. 

The invention also relates to methods for producing a polypeptide comprising growing a
culture of host cells of the invention in a suitable culture medium, and purifying the protein from
the cells or the culture in which the cells are grown. For example, the methods of the invention
include a process for producing a polypeptide in which a host cell containing a suitable
expression vector that includes a polynucleotide of the invention is cultured under conditions that
allow expression of the encoded polypeptide. The polypeptide can be recovered from the
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and
further purified. Preferred embodiments include those in which the protein produced by such 
process is a full length or mature form of the protein. 

In an alternative method, the polypeptide or protein is purified from bacterial cells which 
naturally produce the polypeptide or protein. One skilled in the art can readily follow known
methods for isolating polypeptides and proteins in order to obtain one of the isolated
polypeptides or proteins of the present invention. These include, but are not limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography,
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that
retain biological/immunological activity include fragments comprising greater than about 100
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein
domains.

The purified polypeptides can be used in in vitro binding assays which are well known in
the art to identify molecules which bind to the polypeptides. These molecules include, but are not
limited to, e.g., small molecules, molecules from combinatorial libraries, antibodies or other
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the
molecules are titrated into a plurality of cell cultures or animals and then tested for either
cell/animal death or prolonged survival of the animal/cells.

In addition, the peptides of the invention or molecules capable of binding to the peptides 
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to 
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the 
specificity of the binding molecule for SEQ ID NO: 1787-3572 and 5359-7144. 

The protein of the invention may also be expressed as a product of transgenic animals, 
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized 
by somatic or germ cells containing a nucleotide sequence encoding the protein. 

The proteins provided herein also include proteins characterized by amino acid sequences 
similar to those of purified proteins but into which modifications are naturally provided or
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be
made by those skilled in the art using known techniques. Modifications of interest in the protein
sequences may include the alteration, substitution, replacement, insertion or deletion of a
selected amino acid residue in the coding sequence. For example, one or more of the cysteine
residues may be deleted or replaced with another amino acid to alter the conformation of the
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such
alteration, substitution, replacement, insertion or deletion retains the desired activity of the
protein. Regions of the protein that are important for the protein function can be determined by
various methods known in the art including the alanine-scanning method which involves
systematic substitution of single or strings of amino acids with alanine, followed by testing the
resulting alanine-containing variant for biological activity. This type of analysis determines the 
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are 
important for protein function may be determined by the eMATRIX program. 
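
For illustration only, the set of variants used in an alanine scan can be enumerated as sketched below. The function name, the `window` parameter, and the example peptide are hypothetical; the sketch only generates the substituted sequences, and each variant would still need to be expressed and tested for biological activity as described above.

```python
# Minimal sketch: enumerate alanine-substituted variants of a protein sequence.
# `window` = 1 substitutes single residues; larger values substitute strings
# of consecutive residues. All names are illustrative assumptions.
def alanine_scan(protein, window=1):
    """Return (1-based start position, variant sequence) pairs."""
    variants = []
    for i in range(len(protein) - window + 1):
        segment = protein[i:i + window]
        if set(segment) == {"A"}:
            continue  # segment is already all alanine; nothing to test
        variant = protein[:i] + "A" * window + protein[i + window:]
        variants.append((i + 1, variant))
    return variants


# Hypothetical five-residue peptide.
single = alanine_scan("MKCAL")
double = alanine_scan("MKCAL", window=2)
for position, variant in single:
    print(position, variant)
```

A loss of activity in the variant substituted at a given position suggests that the replaced residue or residues contribute to function.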

Other fragments and derivatives of the sequences of proteins which would be expected to 
retain protein activity in whole or in part and are useful for screening or other immunological
methodologies may also be easily made by those skilled in the art given the disclosures herein.
Such modifications are encompassed by the present invention. 

The protein may also be produced by operably linking the isolated polynucleotide of the 
invention to suitable control sequences in one or more insect expression vectors, and employing 
an insect expression system. Materials and methods for baculovirus/insect cell expression 
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A.
(the MaxBac™ kit), and such methods are well known in the art, as described in Summers and
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present
invention is "transformed."

The protein of the invention may be prepared by culturing transformed host cells under 
culture conditions suitable to express the recombinant protein. The resulting expressed protein 
may then be purified from such culture (i.e., from culture medium or cell extracts) using known 
purification processes, such as gel filtration and ion exchange chromatography. The purification 
of the protein may also include an affinity column containing agents which will bind to the 
protein; one or more column steps over such affinity resins as concanavalin A-agarose, 
heparin-toyopearl™ or Cibacron blue 3GA Sepharose™; one or more steps involving
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl
ether; or immunoaffinity chromatography.

Alternatively, the protein of the invention may also be expressed in a form which will 
facilitate purification. For example, it may be expressed as a fusion protein, such as those of 
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or as a 
His tag. Kits for expression and purification of such fusion proteins are commercially available 
from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen,
respectively. The protein can also be tagged with an epitope and subsequently purified by using 
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially 
available from Kodak (New Haven, Conn.). 

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC)
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other
aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing 
purification steps, in various combinations, can also be employed to provide a substantially 
homogeneous isolated recombinant protein. The protein thus purified is substantially free of 
other mammalian proteins and is defined in accordance with the present invention as an "isolated 
protein." 
The polypeptides of the invention include analogs (variants). This embraces fragments,
as well as peptides in which one or more amino acids have been deleted, inserted, or substituted.
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to
another moiety or moieties, e.g., targeting moiety or another therapeutic agent. Such analogs
may exhibit improved properties such as activity and/or stability. Examples of moieties which
may be fused to the polypeptide or an analog include, for example, targeting moieties which
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells,
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well
as receptors and ligands expressed on pancreatic or immune cells. Other moieties which may be
fused to the polypeptide include therapeutic agents which are used for treatment, for example,
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as
alpha or beta interferon.

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY 
AND SIMILARITY 

Preferred methods to determine identity and/or similarity are designed to give the largest
match between the sequences tested. Methods to determine identity and similarity are codified
in computer programs including, but not limited to, the GCG program package, including GAP
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group,
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F.
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic Acids Res.
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software
(Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam software
(Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein incorporated by
reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. Biol., 157, pp.
105-31 (1982), incorporated herein by reference). The BLAST programs are publicly available
from the National Center for Biotechnology Information (NCBI) and other sources (BLAST
Manual, Altschul, S., et al. NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol.
Biol. 215:403-410 (1990)).
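
By way of illustration only, the notion of finding "the largest match between the sequences
tested" can be sketched as a small global alignment followed by a percent identity count. The
sketch below is not GAP, BLAST, or any other program named above. The function names, the
scoring values (match +1, mismatch -1, gap -2), and the choice of the aligned length as the
denominator are all assumptions made for the example; the programs cited use their own scoring
matrices and gap penalties.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty (illustrative scoring)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions divided by alignment length (an assumed convention)."""
    aln_a, aln_b = global_align(a, b)
    if not aln_a:
        return 0.0
    matches = sum(x == y and x != "-" for x, y in zip(aln_a, aln_b))
    return 100.0 * matches / len(aln_a)
```

For example, under these assumed scores `global_align("ACGT", "AGT")` yields the pair
`("ACGT", "A-GT")`, and `percent_identity("ACGT", "AGT")` counts three identical positions over
an aligned length of four. Other conventions divide by the length of the shorter sequence
instead, which gives a different figure for the same alignment.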

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to
another polypeptide. Within a fusion protein the polypeptide according to the invention can
correspond to all or a portion of a protein according to the invention. In one embodiment, a
fusion protein comprises at least one biologically active portion of a protein according to the
invention. In another embodiment, a fusion protein comprises at least two biologically active
portions of a protein according to the invention. Within the fusion protein, the term "operatively
linked" is intended to indicate that the polypeptide according to the invention and the other
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or
C-terminus.

For example, in one embodiment a fusion protein comprises a polypeptide according to
the invention operably linked to the extracellular domain of a second protein.

In another embodiment, the fusion protein is a GST-fusion protein in which the polypeptide 
sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione 
S-transferase) sequences. 

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which
the polypeptide sequences according to the invention, comprising one or more domains, are fused
to sequences derived from a member of the immunoglobulin protein family. The
immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical
compositions and administered to a subject to inhibit an interaction between a ligand and a
protein of the invention on the surface of a cell, to thereby suppress signal transduction in vivo.
The immunoglobulin fusion proteins can be used to affect the bioavailability of a cognate ligand.
Inhibition of the ligand/protein interaction may be useful therapeutically both for the treatment of
proliferative and differentiative disorders, e.g., cancer, and for modulating (e.g., promoting or
inhibiting) cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be
used as immunogens to produce antibodies in a subject, to purify ligands, and in screening assays
to identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand.

A chimeric or fusion protein of the invention can be produced by standard recombinant
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences
are ligated together in-frame in accordance with conventional techniques, e.g., by employing
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can
be synthesized by conventional techniques including automated DNA synthesizers.
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that
give rise to complementary overhangs between two consecutive gene fragments that can
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for
example, Ausubel et al. (eds.) Current Protocols in Molecular Biology, John Wiley &
Sons, 1992). Moreover, many expression vectors are commercially available that already encode
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the
invention can be cloned into such an expression vector such that the fusion moiety is linked
in-frame to the protein of the invention.
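
As a non-limiting sketch of what "in-frame" means for a fusion gene, the following checks that
two coding sequences (and an optional linker) can be joined without shifting the reading frame,
and translates the result with the standard genetic code. The function names, the handling of
the upstream stop codon, and the toy sequences are assumptions made for illustration and are
not taken from any vector or kit mentioned above.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translate(dna):
    """Translate complete codons of a DNA coding sequence; '*' marks a stop."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def fuse_in_frame(upstream_cds, downstream_cds, linker=""):
    """Join two coding sequences so the downstream partner stays in frame."""
    up, link, down = upstream_cds.upper(), linker.upper(), downstream_cds.upper()
    # Drop the upstream stop codon so translation reads through into the partner.
    if up[-3:] in STOP_CODONS:
        up = up[:-3]
    for name, seq in (("upstream", up), ("linker", link), ("downstream", down)):
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} length {len(seq)} is not a multiple of 3")
    fusion = up + link + down
    protein = translate(fusion)
    # A stop anywhere before the final codon would truncate the fusion protein.
    if "*" in protein[:-1]:
        raise ValueError("internal stop codon in fusion")
    return fusion, protein
```

With the toy inputs `fuse_in_frame("ATGGCCTAA", "ATGAAATGA")`, the upstream stop codon is
removed and the joined sequence `ATGGCCATGAAATGA` translates to `MAMK*`. An upstream fragment
whose length is not a multiple of three raises an error rather than producing a frameshifted
fusion.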

4.8 GENE THERAPY 

Mutations in a gene comprising the polynucleotides of the invention may result in loss of normal
function of the encoded protein. The invention thus provides gene therapy to restore normal
activity of the polypeptides of the invention, or to treat disease states involving polypeptides of
the invention. Delivery of a functional gene encoding polypeptides of the invention to
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example,
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of
the nucleotides of the present invention or a gene encoding the polypeptides of the present
invention can also be accomplished with extrachromosomal substrates (transient expression) or
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.
Alternatively, it is contemplated that in other human disease states, preventing the expression of
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively
regulate the expression of polypeptides of the invention.

Other methods of inhibiting expression of a protein include the introduction of antisense
molecules to the nucleic acids of the present invention, their complements, or their translated RNA
sequences, by methods known in the art. Further, the polypeptides of the present invention can be
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such
as a silencer, which is tissue specific.

The present invention still further provides cells genetically engineered in vivo to express the
polynucleotides of the invention, wherein such polynucleotides are in operative association with a
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of
the present invention.

Knowledge of DNA sequences provided by the invention allows for modification of cells to
permit, increase, or decrease expression of endogenous polypeptide. Cells can be modified (e.g., by
homologous recombination) to provide increased polypeptide expression by replacing, in whole or
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is
operatively linked to the desired protein encoding sequences. See, for example, PCT International
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired
protein coding sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may
be replaced by homologous recombination. As described herein, gene targeting can be used to
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or
protein produced may be replaced, removed, added, or otherwise modified by targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences
for enhancing or modifying transport or secretion properties of the protein, or other sequences
which alter or improve the function or stability of protein or RNA molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the gene
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the
targeting event may replace an existing element; for example, a tissue-specific enhancer can be
replaced by an enhancer that has broader or different cell-type specificity than the naturally
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are
added. In all cases, the identification of the targeting event may be facilitated by the use of one or
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the
targeting event may also be facilitated by the use of one or more marker genes exhibiting the
property of negative selection, such that the negatively selectable marker is linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and
such that a correct homologous recombination event with sequences in the host cell genome does
not result in the stable integration of the negatively selectable marker. Markers useful for this
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial
xanthine-guanine phosphoribosyl-transferase (gpt) gene.
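
The logic of combining a positively selectable marker with a flanking negatively selectable
marker can be summarized, purely as a simplified sketch, in a few lines. The class, the function
names, and the three idealized outcomes below are assumptions introduced for the example. The
sketch assumes that random integration carries both markers into the genome, while a correct
homologous recombination event carries only the positively selectable marker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Idealized marker content of a cell after exposure to the targeting DNA."""
    has_positive_marker: bool   # selectable marker contiguous with the targeting DNA
    has_negative_marker: bool   # flanking marker, e.g. a TK or gpt gene


def survives_selection(cell):
    """A cell survives only if it kept the positive marker and lost the negative one."""
    return cell.has_positive_marker and not cell.has_negative_marker


# Hypothetical, simplified outcomes of introducing the targeting construct.
OUTCOMES = {
    "no integration": Cell(False, False),
    "random integration": Cell(True, True),
    "homologous recombination": Cell(True, False),
}


def surviving_outcomes(outcomes=OUTCOMES):
    """Names of the outcomes that pass both positive and negative selection."""
    return [name for name, cell in outcomes.items() if survives_selection(cell)]
```

Under these assumptions `surviving_outcomes()` returns only the homologous recombination case,
which is why the flanking placement of the negatively selectable marker enriches for correctly
targeted cells. Real selections are less clean than this truth table suggests.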

The gene targeting or gene activation techniques which can be used in accordance with this
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel;
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627
(WO 93/09222) by Selden et al.; and International Application No. PCT/US90/06436
(WO 91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety.

4.9 TRANSGENIC ANIMALS 

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either overexpressed or
inactivated in the germ line of animals using homologous recombination [Capecchi, Science
244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory
control of exogenous or endogenous promoter elements, are known as transgenic animals.
Animals in which an endogenous gene has been inactivated by homologous recombination are
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic
animals are useful to determine the roles polypeptides of the invention play in biological
processes, and preferably in disease states. Transgenic animals are useful as model systems to
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT
Publication No. WO 94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the
polynucleotides of the invention is either activated or inactivated to alter the level of expression
of the polypeptides of the invention. Inactivation can be carried out using homologous
recombination methods described above. Activation can be achieved by supplementing or even
replacing the homologous promoter to provide for increased protein expression. The homologous
promoter can be supplemented by insertion of one or more heterologous enhancer elements
known to confer promoter activation in a particular tissue.

The polynucleotides of the present invention also make possible the development, 
through, e.g., homologous recombination or knock out strategies, of animals that fail to express 
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as 
models for studying the in vivo activities of polypeptide as well as for studying modulators of the 
polypeptides of the invention. 

4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit one or 
more of the uses or biological activities (including those associated with assays cited herein) 
identified herein. Uses or activities described for proteins of the present invention may be 
provided by administration or use of such proteins or of polynucleotides encoding such proteins 
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The 
mechanism underlying the particular condition or pathology will dictate whether the
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic
compositions of the invention" include compositions comprising isolated polynucleotides
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or
polypeptides of the invention (including full length protein, mature protein and truncations or
domains thereof), or compounds and other substances that modulate the overall activity of the
target gene products, either at the level of target gene/protein expression or target protein
activity. Such modulators include polypeptides, analogs (variants), including fragments and
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or 
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening 
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple 
helix formation; and in particular antibodies or other binding partners that specifically recognize 
one or more epitopes of the polypeptides of the invention. 

The polypeptides of the present invention may likewise be involved in cellular activation
or in one of the other physiological pathways described herein.

4.10.1 RESEARCH USES AND UTILITIES

The polynucleotides provided by the present invention can be used by the research
community for various purposes. The polynucleotides can be used to express recombinant
protein for analysis, characterization or therapeutic use; as markers for tissues in which the
corresponding protein is preferentially expressed (either constitutively or at a particular stage of
tissue differentiation or development or in disease states); as molecular weight markers on gels;
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene
positions; to compare with endogenous DNA sequences in patients to identify potential genetic
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known
sequences in the process of discovering other novel polynucleotides; for selecting and making
oligomers for attachment to a "gene chip" or other support, including for examination of
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of
the binding interaction.

The polypeptides provided by the present invention can similarly be used in assays to
determine biological activity, including in a panel of multiple proteins for high-throughput
screening; to raise antibodies or to elicit another immune response; as a reagent (including the 
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its 
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is 
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or 
development or in a disease state); and, of course, to isolate correlative receptors or ligands. 
Proteins involved in these binding interactions can also be used to screen for peptide or small 
molecule inhibitors or agonists of the binding interaction. 

Any or all of these research utilities are capable of being developed into reagent grade or 
kit format for commercialization as research products. 

Methods for performing the uses listed above are well known to those skilled in the art. 
References disclosing such methods include without limitation "Molecular Cloning: A 
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch 
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning 
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987. 

4.10.2 NUTRITIONAL USES 

Polynucleotides and polypeptides of the present invention can also be used as nutritional 
sources or supplements. Such uses include without limitation use as a protein or amino acid 
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In 
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a 
particular organism or can be administered as a separate solid or liquid preparation, such as in the 
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the 
polypeptide or polynucleotide of the invention can be added to the medium in or on which the 
microorganism is cultured. 

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION 
ACTIVITY 

A polypeptide of the present invention may exhibit activity relating to cytokine, cell 
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting) 
activity or may induce production of other cytokines in certain cell populations. A 
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many 
protein factors discovered to date, including all known cytokines, have exhibited activity in one 
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient
confirmation of cytokine activity. The activity of therapeutic compositions of the present
invention is evidenced by any one of a number of routine factor dependent cell proliferation
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3,
MC9/G, M+ (preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK,
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following: 

Assays for T-cell or thymocyte proliferation include without limitation those described 
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol.
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994.

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or 
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation, 
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan 
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse 
and human interferon γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994. 

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells 
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2 
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in 
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991; 
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988;
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse
and human interleukin 6-Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci.
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11-Bennett, F., Giannotti, J.,
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 
6.15.1 John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin 
9-Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. 
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991. 

Assays for T-cell clone responses to antigens (which will identify, among others, proteins 
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and 
cytokine production) include, without limitation, those described in: Current Protocols in 
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober,
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7,
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095,
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol.
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988.

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY 

A polypeptide of the present invention may exhibit stem cell growth factor activity and 
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem 
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or 
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or 
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential 
state which would be useful for re-engineering damaged or diseased tissues, transplantation, 
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce 
large quantities of human cells has important working applications for the production of human 
proteins which currently must be obtained from non-human sources or donors; implantation of
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases; 
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including 
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs 
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung. 

It is contemplated that multiple different exogenous growth factors and/or cytokines may 
be administered in combination with the polypeptide of the invention to achieve the desired 
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and 
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic fibroblast 
growth factor (bFGF). 

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of 
these cells in culture will facilitate the production of large quantities of mature cells. Techniques 
for culturing stem cells are known in the art and administration of polypeptides of the invention, 
optionally with other growth factors and/or cytokines, is expected to enhance the survival and 
proliferation of the stem cell populations. This can be accomplished by direct administration of 
the polypeptide of the invention to the culture medium. Alternatively, stromal cells transfected
with a polynucleotide that encodes the polypeptide of the invention can be used as a feeder
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926).

WO 01/53312 PCT/US00/34263

Stem cells themselves can be transfected with a polynucleotide of the invention to induce
autocrine expression of the polypeptide of the invention. This will allow for generation of
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be
differentiated into the desired mature cell types. These stable cell lines can also serve as a source
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for
polymerase chain reaction experiments. These studies would allow for the isolation and
identification of differentially expressed genes in stem cell populations that regulate stem cell
proliferation and/or maintenance.

Expansion and maintenance of totipotent stem cell populations will be useful in the
treatment of many pathological conditions. For example, polypeptides of the present invention
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be
used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation
of neural cells and for the regeneration of nerve and brain tissue, i.e., for the treatment of central
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition,
the expanded stem cell populations can also be genetically altered for gene therapy purposes and
to decrease host rejection of replacement tissues after grafting or implantation.

Expression of the polypeptide of the invention and its effect on stem cells can also be
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell
types. A broadly applicable method of obtaining pure populations of a specific differentiated
cell type from undifferentiated stem cell populations involves the use of a cell-type specific
promoter driving a selectable marker. The selectable marker allows only cells of the desired type
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus
et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224, (1998))
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza et al.,
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be
accomplished by culturing the stem cells in the presence of a differentiation factor such as
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the
effects of endogenous stem cell factor activity and allow differentiation to proceed.

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder
layer, as described by Thompson et al., Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in
the presence of the polypeptide of the invention alone or in combination with other growth
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell
proliferation is determined by colony formation on a semi-solid support, e.g., as described by
Bernstein et al., Blood, 77: 2316-2321 (1991).



4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of hematopoiesis
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal
biological activity in support of colony forming cells or of factor-dependent cell lines indicates
involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility,
for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy
to stimulate the production of erythroid precursors and/or erythroid cells; in supporting the
growth and proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e.,
traditional CSF activity) useful, for example, in conjunction with chemotherapy to prevent or
treat consequent myelo-suppression; in supporting the growth and proliferation of
megakaryocytes and consequently of platelets, thereby allowing prevention or treatment of
various platelet disorders such as thrombocytopenia, and generally for use in place of or
complementary to platelet transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as
those usually treated with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment
post irradiation/chemotherapy, either in vivo or ex vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous))
as normal cells or genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following:

Suitable assays for proliferation and differentiation of various hematopoietic lines are
cited above.

Assays for embryonic stem cell differentiation (which will identify, among others,
proteins that influence embryonic differentiation hematopoiesis) include, without limitation,
those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., Molecular
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993.

Assays for stem cell survival and differentiation (which will identify, among others,
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in:
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al.,
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay,
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21,
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al.
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.

4.10.6 TISSUE GROWTH ACTIVITY 

A polypeptide of the present invention also may be involved in bone, cartilage, tendon, 
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue 
repair and replacement, and in healing of burns, incisions and ulcers.

A polypeptide of the present invention which induces cartilage and/or bone growth in
circumstances where bone is not normally formed has application in the healing of bone
fractures and cartilage damage or defects in humans and other animals. Compositions of a
polypeptide, antibody, binding partner, or other modulator of the invention may have
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair
of congenital, trauma induced, or oncologic resection induced craniofacial defects, and also is
useful in cosmetic plastic surgery.

A polypeptide of this invention may also be involved in attracting bone-forming cells,
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.)
mediated by inflammatory processes, may also be possible using the composition of the
invention.

Another category of tissue regeneration activity that may involve the polypeptide of the
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or
other tissue formation in circumstances where such tissue is not normally formed has application
in the healing of tendon or ligament tears, deformities and other tendon or ligament defects in
humans and other animals. Such a preparation employing a tendon/ligament-like tissue inducing
protein may have prophylactic use in preventing damage to tendon or ligament tissue, as well as
use in the improved fixation of tendon or ligament to bone or other tissues, and in repairing
defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation induced by
a composition of the present invention contributes to the repair of congenital, trauma induced, or
other tendon or ligament defects of other origin, and is also useful in cosmetic plastic surgery for
attachment or repair of tendons or ligaments. The compositions of the present invention may
provide an environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to effect
tissue repair. The compositions of the invention may also be useful in the treatment of tendinitis,
carpal tunnel syndrome and other tendon or ligament defects. The compositions may also include
an appropriate matrix and/or sequestering agent as a carrier, as is well known in the art.

The compositions of the present invention may also be useful for proliferation of neural
cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central and peripheral
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a
composition may be used in the treatment of diseases of the peripheral nervous system, such as
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous
system diseases, such as Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in
accordance with the present invention include mechanical and traumatic disorders, such as spinal
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies
resulting from chemotherapy or other medical therapies may also be treatable using a
composition of the invention.

Compositions of the invention may also be useful to promote better or faster closure of
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular
insufficiency, surgical and traumatic wounds, and the like.

Compositions of the present invention may also be involved in the generation or
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine,
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the
desired effects may be by inhibition or modulation of fibrotic scarring to allow normal tissue
to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity.

A composition of the present invention may also be useful for gut protection or
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and
conditions resulting from systemic cytokine damage.

A composition of the present invention may also be useful for promoting or inhibiting
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the
growth of tissues described above.

Therapeutic compositions of the invention can be used in the following:

Assays for tissue generation activity include, without limitation, those described in:
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No.
WO91/07491 (skin, endothelium).

Assays for wound healing activity include, without limitation, those described in: Winter,
Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol.
71:382-84 (1978).

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY

A polypeptide of the present invention may also exhibit immune stimulating or immune
suppressing activity, including without limitation the activities for which assays are described
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A
protein may be useful in the treatment of various immune deficiencies and disorders (including
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and
proliferation of T and/or B lymphocytes, as well as effecting the cytolytic activity of NK cells
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g.,
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses,
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present invention
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus,
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome,
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof,
including antibodies) of the present invention may also be useful in the treatment of allergic
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria,
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme,
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune
suppression is desired (including, for example, organ transplantation), may also be treatable
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66,
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an
immune response already in progress or may involve preventing the induction of an immune 
response. The functions of activated T cells may be inhibited by suppressing T cell responses or 
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is 
generally an active, non-antigen-specific, process which requires continuous exposure of the T 
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy 
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and 
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be 
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence 
of the tolerizing agent. 

Down regulating or preventing one or more antigen functions (including without 
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high 
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and 
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell 
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue 
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells, 
followed by an immune reaction that destroys the transplant. The administration of a therapeutic 
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells,
and thus act as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it
may also be necessary to block the function of a combination of B lymphocyte antigens.

The efficacy of particular therapeutic compositions in preventing organ transplant
rejection or GVHD can be assessed using animal models that are predictive of efficacy in
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89:11102-11105
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic
compositions of the invention on the development of that disease.

Blocking antigen function may also be therapeutically useful for treating autoimmune
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are
reactive against self tissue and which promote the production of cytokines and autoantibodies
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T
cell-derived cytokines which may be involved in the disease process. Additionally, blocking
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating
autoimmune disorders can be determined using a number of well-characterized animal models of
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis,
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp.
840-856).

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means
of upregulating immune responses, may also be useful in therapy. Upregulation of immune
responses may be in the form of enhancing an existing immune response or eliciting an initial
immune response. For example, enhancing an immune response may be useful in cases of viral
infection, including systemic viral diseases such as influenza, the common cold, and encephalitis.

Alternatively, anti-viral immune responses may be enhanced in an infected patient by
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed
APCs either expressing a peptide of the present invention or together with a stimulatory form of
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the
patient. Another method of enhancing anti-viral immune responses would be to isolate infected
cells from a patient, transfect them with a nucleic acid encoding a protein of the present
invention as described herein such that the cells express all or a portion of the protein on their
surface, and reintroduce the transfected cells into the patient. The infected cells would now be
capable of delivering a costimulatory signal to, and thereby activating, T cells in vivo.

A polypeptide of the present invention may provide the necessary stimulation signal to T
cells to induce a T cell mediated immune response against the transfected tumor cells. In
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding
an antisense construct which blocks expression of an MHC class II associated protein, such as
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human
subject may be sufficient to overcome tumor-specific tolerance in the subject.

The activity of a protein of the invention may, among other means, be measured by the
following methods:

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D.
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al.,
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching (which
will identify, among others, proteins that modulate T-cell dependent antibody responses and that
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J.
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production,
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994.

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins
that generate predominantly Th1 and CTL responses) include, without limitation, those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify, among others, proteins expressed by
dendritic cells that activate naive T-cells) include, without limitation, those described in: Guery
et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990.

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International
Journal of Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell commitment and development
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al.,
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al.,
Proc. Nat. Acad. Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the
release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention,
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive
based on the ability of inhibins to decrease fertility in female mammals and decrease
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A
polypeptide of the invention may also be useful for advancement of the onset of fertility in
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic
animals such as, but not limited to, cows, sheep and pigs.

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods. 

Assays for activin/inhibin activity include, without limitation, those described in: Vale et
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci.
USA 83:3091-3095, 1986.

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY 

A polypeptide of the present invention may be involved in chemotactic or chemokinetic
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils,
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic
receptor activation can be used to mobilize or attract a desired cell population to a desired site of
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or
modulators of the invention) provide particular advantages in treatment of wounds and other
trauma to tissues, as well as in treatment of localized infections. For example, attraction of
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved
immune responses against the tumor or infecting agent.

A protein or peptide has chemotactic activity for a particular cell population if it can
stimulate, directly or indirectly, the directed orientation or movement of such cell population.
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells.
Whether a particular protein has chemotactic activity for a population of cells can be readily
determined by employing such protein or peptide in any known assay for cell chemotaxis.

Therapeutic compositions of the invention can be used in the following:

Assays for chemotactic activity (which will identify proteins that induce or prevent
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells
across a membrane as well as the ability of a protein to induce the adhesion of one cell
population to another cell population. Suitable assays for movement and adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines
6.12.1-6.12.28); Taub et al. J. Clin. Invest. 95:1370-1376, 1995; Lind et al. APMIS 103:140-146,
1995; Muller et al. Eur. J. Immunol. 25:1744-1748; Gruber et al. J. of Immunol. 152:5860-5867,
1994; Johnston et al. J. of Immunol. 153:1762-1768, 1994.

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY 

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Compositions may be useful in treatment of various coagulation disorders (including
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events
in treating wounds resulting from trauma, surgery or other causes. A composition of the
invention may also be useful for dissolving or inhibiting formation of thromboses and for
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of
cardiac and central nervous system vessels (e.g., stroke)).

Therapeutic compositions of the invention can be used in the following:

Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res.
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins
35:467-474, 1988.

4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation or 
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the 
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For 
example, the presence or increased expression of a polynucleotide/polypeptide of the invention
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy.
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer 
condition. Identification of single nucleotide polymorphisms associated with cancer or a 
predisposition to cancer may also be useful for diagnosis or prognosis.

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation,
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth) 
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic 
compositions of the invention may be effective in adult and pediatric oncology including in solid 
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic 
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma,
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer, 
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell 
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal 
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps 
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle, 
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors, 
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central 
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma, 
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma, 
hemangiopericytoma and Kaposi's sarcoma.

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including 
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be 
administered to treat cancer. Therapeutic compositions can be administered in therapeutically 
effective dosages alone or in combination with adjuvant cancer therapy such as surgery, 
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial 
effect, e.g. reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise 
improving overall clinical condition, without necessarily eradicating the cancer. 

The composition can also be administered in therapeutically effective amounts as a 
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or 
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically 
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine.
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination 
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide,
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin (cis-DDP),
Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin,
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (V16-213),
Floxuridine, 5-Fluorouracil (5-Fu), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide,
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog),
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna,
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl,
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate,
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin,
Semustine, Teniposide, and Vindesine sulfate.

In addition, therapeutic compositions of the invention may be used for prophylactic 
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g. 
exposure to carcinogens) known in the art that predispose an individual to developing cancers. 
Under these circumstances, it may be beneficial to treat these individuals with therapeutically
effective doses of the polypeptide of the invention to reduce the risk of developing cancers.

In vitro models can be used to determine the effective doses of the polypeptide of the
invention as a potential cancer treatment. These in vitro models include proliferation assays of
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and Ch 21),
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52:921-30
(1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in
Pilkington et al., Anticancer Res., 17:4107-9 (1997), and angiogenesis assays such as induction
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial
cell migration as described in Ribatta et al., Intl. J. Dev. Biol., 40:1189-97 (1999) and Li et al.,
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available,
e.g. from American Type Tissue Culture Collection catalogs.
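
As a minimal sketch of how readings from a proliferation assay of cultured tumor cells might be reduced to an effective dose, the following Python computes percent inhibition relative to an untreated control and interpolates, on a log-dose scale, the dose giving 50% inhibition. The function names, the 50% target, the log-linear interpolation and the example readings are illustrative assumptions and are not taken from the cited protocols.

```python
import math


def percent_inhibition(control_signal, treated_signal):
    """Percent reduction in proliferation signal relative to the untreated control."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return 100.0 * (1.0 - treated_signal / control_signal)


def dose_for_inhibition(doses, inhibitions, target=50.0):
    """Interpolate (linearly in log10 dose) the dose giving `target` percent inhibition.

    `doses` must be positive and ascending; returns None if no adjacent pair
    of measurements brackets the target.
    """
    points = list(zip(doses, inhibitions))
    for (d_lo, i_lo), (d_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= target <= i_hi and i_hi > i_lo:
            frac = (target - i_lo) / (i_hi - i_lo)
            log_dose = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10.0 ** log_dose
    return None


# Hypothetical readings (arbitrary units) for an untreated well and three doses.
control = 200.0
doses_ug_per_ml = [1.0, 10.0, 100.0]
treated = [180.0, 150.0, 50.0]
inhibitions = [percent_inhibition(control, t) for t in treated]
ed50 = dose_for_inhibition(doses_ug_per_ml, inhibitions)
```

In practice a fitted sigmoidal curve would usually replace the simple interpolation; the sketch only shows the arithmetic that links assay readings to a dose estimate.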



4.10.12 RECEPTOR/LIGAND ACTIVITY 

A polypeptide of the present invention may also demonstrate activity as receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions
and their ligands (including without limitation, cellular adhesion molecules (such as selectins,
integrins and their ligands) and receptor/ligand pairs involved in antigen presentation, antigen
recognition and development of cellular and humoral immune responses). Receptors and ligands
are also useful for screening of potential peptide or small molecule inhibitors of the relevant
receptor/ligand interaction. A protein of the present invention (including, without limitation,
fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand
interactions.

The activity of a polypeptide of the invention may, among other means, be measured by
the following methods:

Suitable assays for receptor-ligand activity include without limitation those described in:
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M.
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28,
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22), Takai et al., Proc.
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988;
Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995.

By way of example, the polypeptides of the invention may be used as a receptor for a
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel
overlay assays, or other methods known in the art.

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes,
colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein
Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic
Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent
molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of
toxins include, but are not limited to, ricin.



4.10.13 DRUG SCREENING

This invention is particularly useful for screening chemical compounds by using the
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques.
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a 
solid support, borne on a cell surface or located intracellularly. One method of drug screening 
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant 
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can
be used for standard binding assays. One may measure, for example, the formation of complexes 
between polypeptides of the invention or fragments and the agent being tested or examine the 
diminution in complex formation between the novel polypeptides and an appropriate cell line, 
which are well known in the art. 
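
The diminution in complex formation measured in a competitive binding assay can be expressed as a percentage of the specific binding seen without the test compound. The following Python is a minimal sketch of that calculation; the function names, the nonspecific-binding correction, the 50% hit threshold and the example counts are all illustrative assumptions.

```python
def percent_displacement(bound_control, bound_with_compound, nonspecific=0.0):
    """Percent decrease in specific complex formation caused by a test compound."""
    specific_control = bound_control - nonspecific
    if specific_control <= 0:
        raise ValueError("control binding must exceed nonspecific binding")
    specific_test = bound_with_compound - nonspecific
    return 100.0 * (1.0 - specific_test / specific_control)


def select_hits(readings, bound_control, nonspecific=0.0, threshold=50.0):
    """Return (compound, percent displacement) pairs at or above `threshold`, best first."""
    scored = [
        (name, percent_displacement(bound_control, signal, nonspecific))
        for name, signal in readings.items()
    ]
    hits = [item for item in scored if item[1] >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical bound-label counts for two test compounds.
hits = select_hits({"compound_A": 400.0, "compound_B": 950.0},
                   bound_control=1000.0, nonspecific=200.0)
```

Here only the hypothetical compound_A passes the threshold, since it removes three quarters of the specific signal.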

Sources for test compounds that may be screened for ability to bind to or modulate (i.e., 
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of either random or mimetic peptides, oligonucleotides or organic molecules. 

Chemical libraries may be readily synthesized or purchased from a number of 
commercial sources, and may include structural analogs of known compounds or compounds 
that are identified as "hits" or "leads" via natural product screening. 

The sources of natural product libraries are microorganisms (including bacteria and 
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for 
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine 
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include 
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a
review, see Science 282:63-68 (1998).

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or 
organic compounds and can be readily prepared by traditional automated synthesis methods, 
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and 
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein, 
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries. 
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin. 
Biotechnol 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see 
Al-Obeidi et al, Mol Biotechnol, 9(3):205-23 (1998); Hruby et al., Curr Opin Chem Biol, 
1(1):114-19 (1997); Dorner et al., Bioorg Med Chem, 4(5):709-15 (1996) (alkylated dipeptides).

Identification of modulators through use of the various libraries described herein permits 
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a 
polypeptide of the invention. The molecules identified in the binding assay are then tested for 
antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested
for either cell/animal death or prolonged survival of the animal/cells. 

The binding molecules thus identified may be complexed with toxins, e.g., ricin or 
cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding
molecule complex is then targeted to a tumor or other cell by the specificity of the binding
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be 
complexed with imaging agents for targeting and imaging purposes. 

4.10.14 ASSAY FOR RECEPTOR ACTIVITY

The invention also provides methods to detect specific binding of a polypeptide, e.g., a
ligand or a receptor. The art provides numerous assays particularly useful for identifying
previously unknown binding partners for receptor polypeptides of the invention. For example,
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used
to identify polynucleotides encoding binding partners. As another example, affinity
chromatography with the appropriate immobilized polypeptide of the invention can be used to
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number
of different libraries used for the identification of compounds, and in particular small molecules,
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention.
Ligands for receptor polypeptides of the invention can also be identified by adding exogenous
ligands, or cocktails of ligands, to two cell populations that are genetically identical except for
the expression of the receptor of the invention: one cell population expresses the receptor of the
invention whereas the other does not. The responses of the two cell populations to the addition of
ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the
polypeptide of the invention in cells and assayed for an autocrine response to identify potential
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known
in the art can be used to identify binding partner polypeptides, including, (1) organic and
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of random peptides, oligonucleotides or organic molecules.
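
The comparison of two otherwise identical cell populations described above can be sketched as a simple fold-change filter: a candidate ligand is retained when the receptor-expressing population responds markedly more strongly than the non-expressing population. The function name, the two-fold cutoff and the example responses below are assumptions chosen for illustration only.

```python
def receptor_dependent_ligands(with_receptor, without_receptor, min_fold=2.0):
    """Return (ligand, fold response) pairs whose response depends on the receptor.

    Both arguments map a ligand name to a measured response (any positive
    readout, e.g. reporter signal) in the corresponding cell population.
    """
    hits = []
    for ligand, response in with_receptor.items():
        baseline = without_receptor[ligand]
        if baseline <= 0:
            raise ValueError("baseline response must be positive")
        fold = response / baseline
        if fold >= min_fold:
            hits.append((ligand, fold))
    # Strongest receptor-dependent responders first.
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical responses of the two cell populations to three candidate ligands.
expressing = {"ligand_1": 8.0, "ligand_2": 1.1, "ligand_3": 3.0}
non_expressing = {"ligand_1": 1.0, "ligand_2": 1.0, "ligand_3": 1.0}
candidates = receptor_dependent_ligands(expressing, non_expressing)
```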

The role of downstream intracellular signaling molecules in the signaling cascade of the
polypeptide of the invention can be determined. For example, a chimeric protein in which the
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then
be assayed for expected modifications, e.g., phosphorylation. Other methods known to those in the
art can also be used to identify signaling molecules involved in receptor activity.

4.10.15 ANTI-INFLAMMATORY ACTIVITY

Compositions of the present invention may also exhibit anti-inflammatory activity. The
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example,
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production
of other factors which more directly inhibit or promote an inflammatory response. Compositions
with such activities can be used to treat inflammatory conditions (including chronic or acute
conditions), including without limitation inflammation associated with infection (such as septic
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury,
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or resulting from
over production of cytokines such as TNF or IL-1. Compositions of the invention may also be
useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material.
Compositions of this invention may be utilized to prevent or treat conditions such as, but not
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1,
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary
disease, other autoimmune disease or inflammatory disease, or as an antiproliferative agent such
as for acute or chronic myelogenous leukemia or in the prevention of premature labor secondary to
intrauterine infections.

4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a 
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the 
invention. Such leukemias and related disorders include but are not limited to acute leukemia,
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic,
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic 
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see 
Fishman et al, 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia). 

4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of 
intervention with compounds that modulate the activity of the polynucleotides and/or 
polypeptides of the invention, and which can be treated upon thus observing an indication of 
therapeutic utility, include but are not limited to nervous system injuries, and diseases or
disorders which result in either a disconnection of axons, a diminution or degeneration of
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including 
human and non-human mammalian patients) according to the invention include but are not 
limited to the following lesions of either the central (including spinal cord, brain) or peripheral 
nervous systems:

(i) traumatic lesions, including lesions caused by physical injury or associated with 
surgery, for example, lesions which sever a portion of the nervous system, or compression 
injuries; 

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system 
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord
infarction or ischemia;

(iii) infectious lesions, in which a portion of the nervous system is destroyed or injured 
as a result of infection, for example, by an abscess or associated with infection by human 
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease,
tuberculosis, or syphilis;

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or 
injured as a result of a degenerative process including but not limited to degeneration associated 
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral 
sclerosis; 

(v) lesions associated with nutritional diseases or disorders, in which a portion of the
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease, 
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus 
callosum), and alcoholic cerebellar degeneration; 

(vi) neurological lesions associated with systemic diseases including but not limited to
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or
sarcoidosis; 

(vii) lesions caused by toxic substances including alcohol, lead, or particular 
neurotoxins; and 

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or
injured by a demyelinating disease including but not limited to multiple sclerosis, human
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies,
progressive multifocal leukoencephalopathy, and central pontine myelinolysis.

Therapeutics which are useful according to the invention for treatment of a nervous
system disorder may be selected by testing for biological activity in promoting the survival or
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit
any of the following effects may be useful according to the invention:

(i) increased survival time of neurons in culture; 

(ii) increased sprouting of neurons in culture or in vivo; 

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g.,
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the method set
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al.
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc.,
depending on the molecule to be measured; and motor neuron dysfunction may be measured by
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron
conduction velocity, or functional disability.

In specific embodiments, motor neuron disorders that may be treated according to the 
invention include but are not limited to disorders such as infarction, infection, exposure to toxin, 
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as
well as other components of the nervous system, as well as disorders that selectively affect
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile 
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome), 
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy
(Charcot-Marie-Tooth Disease).



4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following additional
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents,
including, without limitation, bacteria, viruses, fungi and other parasites; affecting (suppressing
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape
(such as, for example, breast augmentation or diminution, change in bone form or shape);
affecting biorhythms or circadian cycles or rhythms; affecting the fertility of male or female
subjects; affecting the metabolism, catabolism, anabolism, processing, utilization, storage or
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other
nutritional factors or components; affecting behavioral characteristics, including, without
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting
deficiencies of the enzyme and treating deficiency-related diseases; treatment of
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen
in a vaccine composition to raise an immune response against such protein or another material or
entity which is cross-reactive with such protein.



4.10.19 IDENTIFICATION OF POLYMORPHISMS 

The demonstration of polymorphisms makes possible the identification of such 
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or 
susceptibility to various disease states (such as disorders involving inflammation or immune 
response) or a differential response to drug administration, and this genetic information can be 
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a 
polymorphism associated with a predisposition to inflammation or autoimmune disease makes
possible the diagnosis of this condition in humans by identifying the presence of the 
polymorphism. 

Polymorphisms can be identified in a variety of ways known in the art which all
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally
involving isolation or amplification of the DNA, and identifying the presence of the
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides).
In addition, traditional restriction fragment length polymorphism analysis (using restriction
enzymes that provide differential digestion of the genomic DNA depending on the presence or
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the
present invention can be used to detect polymorphisms. The array can comprise modified
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the
present invention. In the alternative, any one of the nucleotide sequences of the present
invention can be placed on the array to detect changes from those sequences.
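
The principle behind restriction fragment length polymorphism analysis can be illustrated with a short in silico digest: a single base change that destroys a recognition site changes the set of fragment lengths. The Python below is a minimal sketch; the two short sequences are invented for illustration (they are not sequences of the invention), and EcoRI (recognition site GAATTC, cutting after the first G) is used only as a familiar example enzyme. Because that site is palindromic, searching one strand is sufficient here.

```python
def digest(sequence, site, cut_offset):
    """Return fragment lengths after cutting `sequence` at every occurrence of `site`.

    `cut_offset` is the number of bases from the start of the site to the cut.
    """
    sequence = sequence.upper()
    site = site.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:]) if end > begin]


# Hypothetical amplified fragment: the variant allele differs by one base (A to G).
reference_allele = "ATGCGAATTCTTAGC"
variant_allele = "ATGCGAGTTCTTAGC"

reference_fragments = digest(reference_allele, "GAATTC", cut_offset=1)
variant_fragments = digest(variant_allele, "GAATTC", cut_offset=1)
polymorphism_detected = reference_fragments != variant_fragments
```

The reference allele is cut once and the variant allele is left intact, so the two alleles would give distinguishable banding patterns.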

Alternatively a polymorphism resulting in a change in the amino acid sequence could 
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g.,
by an antibody specific to the variant sequence. 



4.10.20 ARTHRITIS AND INFLAMMATION 

The immunosuppressive effects of the compositions of the invention against rheumatoid
arthritis are determined in an experimental animal model system. The experimental model system
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983,
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129.
Induction of the disease can be caused by a single injection, generally intradermally, of a
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant
mixture. The polypeptide is administered in phosphate buffered solution (PBS) at a dose of about
1-5 mg/kg. The control consists of administering PBS only.

The procedure for testing the effects of the test compound would consist of intradermally
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and
24 days after injection of Mycobacterium CFA, an overall arthritis score may be obtained as
described by J. Holoshitz above. An analysis of the data would reveal that the test compound
would have a dramatic effect on the swelling of the joints as measured by a decrease of the
arthritis score.
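
The schedule and readout of this procedure can be laid out in a few lines of Python. The sketch below assumes that the day of adjuvant injection is day 0 and that "every other day" means days 0, 2, 4 and so on through day 24; the function names and the example scores are hypothetical and serve only to show how a decrease in the arthritis score relative to the PBS control might be tabulated.

```python
SCORING_DAYS = [14, 15, 18, 20, 22, 24]  # days after injection, as listed in the text


def treatment_days(last_day=24, interval=2):
    """Days on which the test compound is given: day 0, then every `interval` days."""
    return list(range(0, last_day + 1, interval))


def mean_score_by_day(scores_by_animal, days=SCORING_DAYS):
    """Average the arthritis score across animals for each scoring day with data."""
    means = {}
    for day in days:
        values = [s[day] for s in scores_by_animal.values() if day in s]
        if values:
            means[day] = sum(values) / len(values)
    return means


def score_reduction(control_means, treated_means):
    """Control mean minus treated mean for every day scored in both groups."""
    return {d: control_means[d] - treated_means[d]
            for d in control_means if d in treated_means}


# Hypothetical scores for two PBS-control rats and two treated rats.
control = {"rat1": {14: 8, 15: 10}, "rat2": {14: 6, 15: 12}}
treated = {"rat3": {14: 4, 15: 5}, "rat4": {14: 2, 15: 3}}
reduction = score_reduction(mean_score_by_day(control), mean_score_by_day(treated))
```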

4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or 
other binding partners or modulators including antisense polynucleotides) of the invention have 
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications 
include, but are not limited to, those exemplified herein.



4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of the
polypeptides or other composition of the invention to individuals affected by a disease or
disorder that can be modulated by regulating the peptides of the invention. While the mode of
administration is not particularly important, parenteral administration is preferred. An
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the
polypeptides or other composition of the invention will normally be determined by the
prescribing physician. It is to be expected that the dosage will vary according to the age, weight,
condition and response of the individual patient. Typically, the amount of polypeptide
administered per dose will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight, with
the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body weight. For parenteral
administration, polypeptides of the invention will be formulated in an injectable form combined
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art
and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting
of small amounts of the human serum albumin. The vehicle may contain minor amounts of
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient.
The preparation of such solutions is within the skill of the art.
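
Because the dose ranges above are stated per kilogram of body weight, the absolute amount per dose scales linearly with patient weight. The following Python is a minimal arithmetic sketch of that conversion; the function and constant names and the 70 kg example weight are assumptions for illustration, and the sketch is not dosing guidance.

```python
UG_PER_MG = 1000.0

# Ranges from the text, expressed in micrograms per kilogram of body weight.
BROAD_RANGE_UG_PER_KG = (0.01, 100 * UG_PER_MG)      # 0.01 ug/kg to 100 mg/kg
PREFERRED_RANGE_UG_PER_KG = (0.1, 10 * UG_PER_MG)    # 0.1 ug/kg to 10 mg/kg


def dose_range_ug(body_weight_kg, range_ug_per_kg):
    """Absolute dose limits (micrograms) for a patient of the given weight."""
    low, high = range_ug_per_kg
    return (low * body_weight_kg, high * body_weight_kg)


def classify_dose(dose_ug, body_weight_kg):
    """Say whether a dose falls in the preferred range, the broad range, or neither."""
    per_kg = dose_ug / body_weight_kg
    if PREFERRED_RANGE_UG_PER_KG[0] <= per_kg <= PREFERRED_RANGE_UG_PER_KG[1]:
        return "preferred"
    if BROAD_RANGE_UG_PER_KG[0] <= per_kg <= BROAD_RANGE_UG_PER_KG[1]:
        return "broad"
    return "outside"


# Hypothetical 70 kg patient.
preferred_low_ug, preferred_high_ug = dose_range_ug(70.0, PREFERRED_RANGE_UG_PER_KG)
```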



4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF
ADMINISTRATION 

A protein or other composition of the present invention (from whatever source derived,
including without limitation from recombinant and non-recombinant sources and including
antibodies and other binding partners of the polypeptides of the invention) may be administered
to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents,
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the
effectiveness of the biological activity of the active ingredient(s). The characteristics of the
carrier will depend on the route of administration. The pharmaceutical composition of the
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12,
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell
factor, and erythropoietin. In further compositions, proteins of the invention may be combined
with other agents beneficial to the treatment of the disease or disorder in question. These agents
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor
(IGF), as well as cytokines described herein.

The pharmaceutical composition may further contain other agents which either enhance
the activity of the protein or other active ingredient or complement its activity or use in 
treatment. Such additional factors and/or agents may be included in the pharmaceutical 
composition to produce a synergistic effect with protein or other active ingredient of the 
invention, or to minimize side effects. Conversely, protein or other active ingredient of the 
present invention may be included in formulations of the particular clotting factor, cytokine, 
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti- 
inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other 
hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent (such as 
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein 
of the present invention may be active in multimers (e.g., heterodimers or homodimers) or 
complexes with itself or other proteins. As a result, pharmaceutical compositions of the 
invention may comprise a protein of the invention in such multimeric or complexed form. 

As an alternative to being included in a pharmaceutical composition of the invention 
including a first protein, a second protein or a therapeutic agent may be concurrently 
administered with the first protein (e.g., at the same time, or at differing times provided that 
therapeutic concentrations of the combination of agents are achieved at the treatment site). 
Techniques for formulation and administration of the compounds of the instant application may 
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest 
edition. A therapeutically effective dose further refers to that amount of the compound sufficient 
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the 
relevant medical condition, or an increase in rate of treatment, healing, prevention or 
amelioration of such conditions. When applied to an individual active ingredient, administered 
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a 
combination, a therapeutically effective dose refers to combined amounts of the active 
ingredients that result in the therapeutic effect, whether administered in combination, serially or 
simultaneously. 

In practicing the method of treatment or use of the present invention, a therapeutically 
effective amount of protein or other active ingredient of the present invention is administered to 
a mammal having a condition to be treated. Protein or other active ingredient of the present 
invention may be administered in accordance with the method of the invention either alone or in 
combination with other therapies such as treatments employing cytokines, lymphokines or other 
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other 
hematopoietic factors, protein or other active ingredient of the present invention may be 
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic 
factor(s), thrombolytic or antithrombotic factors, or sequentially. If administered sequentially, 
the attending physician will decide on the appropriate sequence of administering protein or other 
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other 
hematopoietic factor(s), thrombolytic or antithrombotic factors. 

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or 
intestinal administration; parenteral delivery, including intramuscular, subcutaneous, 
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous, 
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active 
ingredient of the present invention used in the pharmaceutical composition or to practice the 
method of the present invention can be carried out in a variety of conventional ways, such as oral 
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral 
or intravenous injection. Intravenous administration to the patient is preferred. 

Alternately, one may administer the compound in a local rather than systemic manner, for 
example, via injection of the compound directly into arthritic joints or fibrotic tissue, often in 
a depot or sustained release formulation. In order to prevent the scarring process frequently 
occurring as a complication of glaucoma surgery, the compounds may be administered topically, 
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery 
system, for example, in a liposome coated with a specific antibody, targeting, for example, 
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the 
afflicted tissue. 

The polypeptides of the invention are administered by any route that delivers an effective 
dosage to the desired site of action. The determination of a suitable route of administration and 
an effective dosage for a particular indication is within the level of skill in the art. Preferably for 
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage 
ranges for the polypeptides of the invention can be extrapolated from these dosages or from 
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the 
clinician to provide maximal therapeutic benefit. 

4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus may 
be formulated in a conventional manner using one or more physiologically acceptable carriers 
comprising excipients and auxiliaries which facilitate processing of the active compounds into 
preparations which can be used pharmaceutically. These pharmaceutical compositions may be 
manufactured in a manner that is itself known, e.g., by means of conventional mixing, 
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or 
lyophilizing processes. Proper formulation is dependent upon the route of administration chosen. 
When a therapeutically effective amount of protein or other active ingredient of the present 
invention is administered orally, protein or other active ingredient of the present invention will 
be in the form of a tablet, capsule, powder, solution or elixir. When administered in tablet form, 
the pharmaceutical composition of the invention may additionally contain a solid carrier such as 
a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95% protein or 
other active ingredient of the present invention, and preferably from about 25 to 90% protein or 
other active ingredient of the present invention. When administered in liquid form, a liquid 
carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, mineral oil, 
soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the 
pharmaceutical composition may further contain physiological saline solution, dextrose or other 
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol. 
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to 
90% by weight of protein or other active ingredient of the present invention, and preferably from 
about 1 to 50% protein or other active ingredient of the present invention. 

When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or 
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally 
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other 
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within 
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or 
subcutaneous injection should contain, in addition to protein or other active ingredient of the 
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection, 
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or 
other vehicle as known in the art. The pharmaceutical composition of the present invention may 
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of 
skill in the art. For injection, the agents of the invention may be formulated in aqueous solutions, 
preferably in physiologically compatible buffers such as Hanks's solution, Ringer's solution, or 
physiological saline buffer. For transmucosal administration, penetrants appropriate to the 
barrier to be permeated are used in the formulation. Such penetrants are generally known in the 
art. 

For oral administration, the compounds can be formulated readily by combining the 
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers 
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules, 
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be 
treated. Pharmaceutical preparations for oral use can be obtained from a solid excipient, 
optionally grinding a resulting mixture, and processing the mixture of granules, after adding 
suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are, in 
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose 
preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, 
gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium 
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents 
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt 
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this 
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic, 
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer 
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be 
added to the tablets or dragee coatings for identification or to characterize different combinations 
of active compound doses. 

Pharmaceutical preparations which can be used orally include push-fit capsules made of 
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or 
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as 
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and, 
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in 
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, 
stabilizers may be added. All formulations for oral administration should be in dosages suitable 
for such administration. For buccal administration, the compositions may take the form of 
tablets or lozenges formulated in conventional manner. 

For administration by inhalation, the compounds for use according to the present 
invention are conveniently delivered in the form of an aerosol spray presentation from 
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g., 
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or 
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by 
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in 
an inhaler or insufflator may be formulated containing a powder mix of the compound and a 
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral 
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for 
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with 
an added preservative. The compositions may take such forms as suspensions, solutions or 
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, 
stabilizing and/or dispersing agents. 

Pharmaceutical formulations for parenteral administration include aqueous solutions of 
the active compounds in water-soluble form. Additionally, suspensions of the active compounds 
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or 
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or 
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which 
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or 
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which 
increase the solubility of the compounds to allow for the preparation of highly concentrated 
solutions. Alternatively, the active ingredient may be in powder form for constitution with a 
suitable vehicle, e.g., sterile pyrogen-free water, before use. 

The compounds may also be formulated in rectal compositions such as suppositories or 
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other 
glycerides. In addition to the formulations described previously, the compounds may also be 
formulated as a depot preparation. Such long acting formulations may be administered by 
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. 
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic 
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as 
sparingly soluble derivatives, for example, as a sparingly soluble salt. 

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent 
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and 
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution 
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v 
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system 
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent 
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic 
administration. Naturally, the proportions of a co-solvent system may be varied considerably 
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the 
co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may 
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other 
biocompatible polymers may replace polyethylene glycol, e.g. polyvinyl pyrrolidone; and other 
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for 
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well 
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents 
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity. 
Additionally, the compounds may be delivered using a sustained-release system, such as 
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent. 
Various types of sustained-release materials have been established and are well known by those 
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the 
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the 
biological stability of the therapeutic reagent, additional strategies for protein or other active 
ingredient stabilization may be employed. 
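
The VPD co-solvent figures given above (3% w/v benzyl alcohol, 8% w/v polysorbate 80, 65% w/v polyethylene glycol 300, then a 1:1 dilution with 5% dextrose in water) can be turned into a simple worked calculation. The Python sketch below is illustrative only: it treats % w/v as grams per 100 mL and assumes, as a simplification not stated in the text, that volumes are additive on dilution so that each concentration halves. The function and variable names are hypothetical.

```python
# Composition of the VPD stock as quoted in the text (% w/v = g per 100 mL).
VPD_PERCENT_WV = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
DEXTROSE_PERCENT_WV = 5.0  # the "5W" diluent: 5% dextrose in water


def grams_needed(percent_wv: float, volume_ml: float) -> float:
    """Mass in grams of a component at percent_wv in volume_ml of solution."""
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    return percent_wv / 100.0 * volume_ml


def vpd_stock_recipe(volume_ml: float) -> dict:
    """Grams of each solute for volume_ml of VPD, made up to volume in ethanol."""
    return {name: grams_needed(pct, volume_ml)
            for name, pct in VPD_PERCENT_WV.items()}


def vpd_5w_concentrations() -> dict:
    """Approximate % w/v after a 1:1 dilution, assuming additive volumes."""
    diluted = {name: pct / 2.0 for name, pct in VPD_PERCENT_WV.items()}
    diluted["dextrose"] = DEXTROSE_PERCENT_WV / 2.0
    return diluted


if __name__ == "__main__":
    for name, grams in vpd_stock_recipe(100.0).items():
        print(f"{name}: {grams:.1f} g per 100 mL of VPD")
    for name, pct in vpd_5w_concentrations().items():
        print(f"{name}: {pct:.2f}% w/v in VPD:5W")
```

Because the text notes that the proportions of a co-solvent system may be varied considerably, the composition is held in a single dictionary so that alternative proportions can be substituted without changing the functions.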

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers 
or excipients. Examples of such carriers or excipients include but are not limited to calcium 
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and 
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be 
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically 
acceptable base addition salts are those salts which retain the biological effectiveness and 
properties of the free acids and which are obtained by reaction with inorganic or organic bases 
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine, 
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanol amine and 
the like. 

The pharmaceutical composition of the invention may be in the form of a complex of the 
protein(s) or other active ingredient(s) of the present invention along with protein or peptide 
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T 
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin 
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following 
presentation of the antigen by MHC proteins. MHC and structurally related proteins including 
those encoded by class I and class II MHC genes on host cells will serve to present the peptide 
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified 
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells. 
Alternatively, antibodies able to bind surface immunoglobulin and other molecules on B cells as 
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the 
pharmaceutical composition of the invention. 

The pharmaceutical composition of the invention may be in the form of a liposome in 
which protein of the present invention is combined, in addition to other pharmaceutically 
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as 
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable 
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides, 
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such 
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S. 
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated 
herein by reference. 

The amount of protein or other active ingredient of the present invention in the 
pharmaceutical composition of the present invention will depend upon the nature and severity of 
the condition being treated, and on the nature of prior treatments which the patient has 
undergone. Ultimately, the attending physician will decide the amount of protein or other active 
ingredient of the present invention with which to treat each individual patient. Initially, the 
attending physician will administer low doses of protein or other active ingredient of the present 
invention and observe the patient's response. Larger doses of protein or other active ingredient 
of the present invention may be administered until the optimal therapeutic effect is obtained for 
the patient, and at that point the dosage is not increased further. It is contemplated that the 
various pharmaceutical compositions used to practice the method of the present invention should 
contain about 0.01 ng to about 100 mg (preferably about 0.1 ng to about 10 mg, more preferably 
about 0.1 µg to about 1 mg) of protein or other active ingredient of the present invention per kg 
body weight. For compositions of the present invention which are useful for bone, cartilage, 
tendon or ligament regeneration, the therapeutic method includes administering the composition 
topically, systemically, or locally as an implant or device. When administered, the therapeutic 
composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable 
form. Further, the composition may desirably be encapsulated or injected in a viscous form for 
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable 
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other 
active ingredient of the invention which may also optionally be included in the composition as 
described above, may alternatively or additionally, be administered simultaneously or 
sequentially with the composition in the methods of the invention. Preferably for bone and/or 
cartilage formation, the composition would include a matrix capable of delivering the 
protein-containing or other active ingredient-containing composition to the site of bone and/or 
cartilage damage, providing a structure for the developing bone and cartilage and optimally 
capable of being resorbed into the body. Such matrices may be formed of materials presently in 
use for other implanted medical applications. 

The choice of matrix material is based on biocompatibility, biodegradability, mechanical 
properties, cosmetic appearance and interface properties. The particular application of the 
compositions will define the appropriate formulation. Potential matrices for the compositions 
may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate, 
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials 
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further 
matrices are comprised of pure proteins or extracellular matrix components. Other potential 
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass, 
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the above 
mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and 
tricalcium phosphate. The bioceramics may be altered in composition, such as in 
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and 
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and 
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns. 
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl 
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from 
the matrix. 

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses 
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose, 
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and 
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose 
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate, 
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol). 
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on 
total formulation weight, which represents the amount necessary to prevent desorption of the 
protein from the polymer matrix and to provide appropriate handling of the composition, yet not 
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the 
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further 
compositions, proteins or other active ingredients of the invention may be combined with other 
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in 
question. These agents include various growth factors such as epidermal growth factor (EGF), 
platelet derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and 
insulin-like growth factor (IGF). 

The therapeutic compositions are also presently valuable for veterinary applications. 
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired 
patients for such treatment with proteins or other active ingredients of the present invention. The 
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue 
regeneration will be determined by the attending physician considering various factors which 
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of 
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g., 
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and 
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution and 
with inclusion of other proteins in the pharmaceutical composition. For example, the addition of 
other known growth factors, such as IGF I (insulin like growth factor I), to the final composition, 
may also affect the dosage. Progress can be monitored by periodic assessment of tissue/bone 
growth and/or repair, for example, X-rays, histomorphometric determinations and tetracycline 
labeling. 

Polynucleotides of the present invention can also be used for gene therapy. Such 
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a 
mammalian subject. Polynucleotides of the invention may also be administered by other known 
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in 
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 

4.12.3 EFFECTIVE DOSAGE 

Pharmaceutical compositions suitable for use in the present invention include 
compositions wherein the active ingredients are contained in an effective amount to achieve their 
intended purpose. More specifically, a therapeutically effective amount means an amount 
effective to prevent development of or to alleviate the existing symptoms of the subject being 
treated. Determination of the effective amount is well within the capability of those skilled in 
the art, especially in light of the detailed disclosure provided herein. For any compound used in 
the method of the invention, the therapeutically effective dose can be estimated initially from 
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a 
circulating concentration range that can be used to more accurately determine useful doses in 
humans. For example, a dose can be formulated in animal models to achieve a circulating 
concentration range that includes the IC50 as determined in cell culture (i.e., the concentration of 
the test compound which achieves a half-maximal inhibition of the protein's biological activity). 
Such information can be used to more accurately determine useful doses in humans. 

A therapeutically effective dose refers to that amount of the compound that results in 
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic 
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell 
cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the 
population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose 
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the 
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred. 
The data obtained from these cell culture assays and animal studies can be used in formulating a 
range of dosage for use in humans. The dosage of such compounds lies preferably within a range 
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may 
vary within this range depending upon the dosage form employed and the route of administration 
utilized. The exact formulation, route of administration and dosage can be chosen by the 
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The 
Pharmacological Basis of Therapeutics", Ch. 1 p.1. Dosage amount and interval may be adjusted 
individually to provide plasma levels of the active moiety which are sufficient to maintain the 
desired effects, or minimal effective concentration (MEC). The MEC will vary for each 
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will 
depend on individual characteristics and route of administration. However, HPLC assays or 
bioassays can be used to determine plasma concentrations. 
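
The therapeutic index described above is the ratio of the LD50 to the ED50, with higher values preferred. A minimal Python sketch of that ratio follows; the compound names and the LD50 and ED50 values are hypothetical placeholders chosen only to show the arithmetic, and are not data from this disclosure.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return LD50 / ED50; both doses must be in the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical (LD50, ED50) pairs in mg/kg, for illustration only.
candidates = {
    "compound_a": (200.0, 5.0),
    "compound_b": (90.0, 30.0),
}

# Rank candidates so that the highest therapeutic index comes first.
ranked = sorted(
    candidates,
    key=lambda name: therapeutic_index(*candidates[name]),
    reverse=True,
)

if __name__ == "__main__":
    for name in ranked:
        print(name, therapeutic_index(*candidates[name]))
```

The ratio is only meaningful when both doses are measured in the same units and the same test population, which is why the function takes plain numbers and leaves unit handling to the caller.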

Dosage intervals can also be determined using the MEC value. Compounds should be 
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the 
time, preferably between 30-90% and most preferably between 50-90%. In cases of local 
administration or selective uptake, the effective local concentration of the drug may not be 
related to plasma concentration. 
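
One way to check a regimen against the 10-90% criterion above is to estimate, from sampled plasma concentrations, the fraction of the observation window spent at or above the MEC. The Python sketch below is an assumption-laden illustration: it linearly interpolates between samples (real concentration curves are not piecewise linear), and the function names and sample values are hypothetical.

```python
from typing import Sequence


def fraction_above_mec(times_h: Sequence[float],
                       conc: Sequence[float],
                       mec: float) -> float:
    """Fraction of the sampled window with concentration >= mec.

    Concentrations between samples are linearly interpolated.
    """
    if len(times_h) != len(conc) or len(times_h) < 2:
        raise ValueError("need at least two matching time/concentration samples")
    above = 0.0
    for i in range(len(times_h) - 1):
        t0, t1 = times_h[i], times_h[i + 1]
        c0, c1 = conc[i], conc[i + 1]
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        if c0 >= mec and c1 >= mec:
            above += dt                      # whole segment at or above the MEC
        elif c0 < mec and c1 < mec:
            continue                         # whole segment below the MEC
        else:
            # Exactly one endpoint is below the MEC, so c1 != c0 here.
            t_cross = t0 + (mec - c0) / (c1 - c0) * dt
            above += (t1 - t_cross) if c1 >= mec else (t_cross - t0)
    return above / (times_h[-1] - times_h[0])


def within_window(fraction: float, low: float = 0.10, high: float = 0.90) -> bool:
    """True if the fraction lies in the 10-90% window quoted in the text."""
    return low <= fraction <= high


if __name__ == "__main__":
    # Hypothetical samples over a 24 h dosing interval (arbitrary units).
    times = [0.0, 2.0, 4.0, 8.0, 12.0, 24.0]
    levels = [0.0, 8.0, 10.0, 6.0, 4.0, 0.0]
    frac = fraction_above_mec(times, levels, mec=4.0)
    print(f"{frac:.1%} of the interval at or above the MEC:", within_window(frac))
```

The `low` and `high` parameters can be set to 0.30 or 0.50 and 0.90 to test the preferred and most preferred windows instead.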

An exemplary dosage regimen for polypeptides or other compositions of the invention 
will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with the preferred 
dose being about 0.1 ng/kg to 25 mg/kg of patient body weight daily, varying in adults and 
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter 
intervals. 

The amount of composition administered will, of course, be dependent on the subject 
being treated, on the subject's age and weight, the severity of the affliction, the manner of 
administration and the judgment of the prescribing physician. 



4.12.4 PACKAGING 

The compositions may, if desired, be presented in a pack or dispenser device which may 
contain one or more unit dosage forms containing the active ingredient. The pack may, for 
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may 
be accompanied by instructions for administration. Compositions comprising a compound of the 
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an 
appropriate container, and labeled for treatment of an indicated condition. 



4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins of the 
invention. The term "antibody" as used herein refers to immunoglobulin molecules and 
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain 
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies 
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and 
F(ab')2 fragments, and an Fab expression library. In general, an antibody molecule obtained from 
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another 
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well, 
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or 
a lambda chain. Reference herein to antibodies includes a reference to all such classes, 
subclasses and types of human antibody species. 

An isolated related protein of the invention may be intended to serve as an antigen, or a 
portion or fragment thereof, and additionally can be used as an immunogen to generate 
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal 
and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the 
invention provides antigenic peptide fragments of the antigen for use as immunogens. An 
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence 
of the full length protein, such as an amino acid sequence shown in SEQ ID NO: 1787, and 
encompasses an epitope thereof such that an antibody raised against the peptide forms a specific 
immune complex with the full length protein or with any fragment that contains the epitope. 
Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino 
acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred 
epitopes encompassed by the antigenic peptide are regions of the protein that are located on its 
surface; commonly these are hydrophilic regions. 

In certain embodiments of the invention, at least one epitope encompassed by the 
antigenic peptide is a region of a related protein that is located on the surface of the protein, e.g., a 
hydrophilic region. A hydrophobicity analysis of the human related protein sequence will 
indicate which regions of a related protein are particularly hydrophilic and, therefore, are likely 
to encode surface residues useful for targeting antibody production. As a means for targeting 
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity 
may be generated by any method well known in the art, including, for example, the Kyte 
Doolittle or the Hopp Woods methods, either with or without Fourier transformation. See, e.g., 
Hopp and Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J. 
Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety. 
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives, 
fragments, analogs or homologs thereof, are also provided herein. 
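
As a non-limiting illustration of a hydropathy analysis, the following sketch computes a sliding-window profile using the Kyte-Doolittle hydropathy scale, without Fourier transformation. The window length of 7 residues, the threshold of zero, the function names and the example peptide are assumptions made here for illustration; the peptide is hypothetical and is not a sequence of the invention.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean Kyte-Doolittle score for each window along the sequence."""
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    # A KeyError here signals a residue outside the 20 standard amino acids.
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(scores) - window + 1)
    ]


def hydrophilic_windows(sequence, window=7, threshold=0.0):
    """Start positions (0-based) of windows whose mean score is below threshold."""
    profile = hydropathy_profile(sequence, window)
    return [start for start, score in enumerate(profile) if score < threshold]


# Hypothetical peptide used only to exercise the functions.
peptide = "MKKDEERSGLLLIVVAFW"
for start in hydrophilic_windows(peptide):
    print(start, peptide[start:start + 7])
```

Windows with low mean scores are candidate hydrophilic, and therefore possibly surface-exposed, regions. Any such prediction would still need to be confirmed experimentally.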

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog 
thereof, may be utilized as an immunogen in the generation of antibodies that 
immunospecifically bind these protein components. 

Various procedures known within the art may be used for the production of polyclonal or 
monoclonal antibodies directed against a protein of the invention, or against derivatives, 
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory 
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring 
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below. 

5.13.1 Polyclonal Antibodies 

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit, 
goat, mouse or other mammal) may be immunized by one or more injections with the native 
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate 
immunogenic preparation can contain, for example, the naturally occurring immunogenic 
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a 
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to 
a second protein known to be immunogenic in the mammal being immunized. Examples of such 
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin, 
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an 
adjuvant. Various adjuvants used to increase the immunological response include, but are not 
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface 
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, 
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and 
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of 
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A, 
synthetic trehalose dicorynomycolate). 

The polyclonal antibody molecules directed against the immunogenic protein can be 
isolated from the mammal (e.g., from the blood) and further purified by well known techniques, 
such as affinity chromatography using protein A or protein G, which provide primarily the IgG 
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the 
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to 
purify the immune specific antibody by immunoaffinity chromatography. Purification of 
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The 
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28). 

5.13.2 Monoclonal Antibodies 

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used 
herein, refers to a population of antibody molecules that contain only one molecular species of 
antibody molecule consisting of a unique light chain gene product and a unique heavy chain 
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal 
antibody are identical in all the molecules of the population. MAbs thus contain an antigen 
binding site capable of immunoreacting with a particular epitope of the antigen characterized by 
a unique binding affinity for it. 

Monoclonal antibodies can be prepared using hybridoma methods, such as those 
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse, 
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to 
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind 
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro. 
The immunizing agent will typically include the protein antigen, a fragment thereof or a fusion 
protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human origin 
are desired, or spleen cells or lymph node cells are used if non-human mammalian sources are 
desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing 
agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies: 
Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell lines are usually 
transformed mammalian cells, particularly myeloma cells of rodent, bovine and human origin. 
Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can be cultured in 
a suitable culture medium that preferably contains one or more substances that inhibit the growth 
or survival of the unfused, immortalized cells. For example, if the parental cells lack the enzyme 
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium for 
the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT 
medium"), which substances prevent the growth of HGPRT-deficient cells. 

Preferred immortalized cell lines are those that fuse efficiently, support stable high level 
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium 
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which 
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego, 
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and 
mouse-human heteromyeloma cell lines also have been described for the production of human 
monoclonal antibodies (Kozbor, J. Immunol., 131:3001 (1984); Brodeur et al., Monoclonal 
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp. 
51-63). 

The culture medium in which the hybridoma cells are cultured can then be assayed for 
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding 
specificity of monoclonal antibodies produced by the hybridoma cells is determined by 
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or 
enzyme-linked immunoabsorbent assay (ELISA). Such techniques and assays are known in the 
art. The binding affinity of the monoclonal antibody can, for example, be determined by the 
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably, 
antibodies having a high degree of specificity and a high binding affinity for the target antigen 
are isolated. 
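
To illustrate how a binding affinity may be extracted from equilibrium binding data, the sketch below applies the classical linear Scatchard transformation, in which bound/free is plotted against bound and the slope equals -1/Kd. This plain least-squares fit is an assumption adopted here for simplicity and is not a reproduction of the computational procedure of Munson and Pollard; the function name and the data are hypothetical.

```python
def scatchard_fit(bound, free):
    """Estimate (Kd, Bmax) from paired bound and free ligand concentrations.

    Fits the line  bound/free = -bound/Kd + Bmax/Kd  by ordinary least squares.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    if any(f <= 0 for f in free):
        raise ValueError("free concentrations must be positive")
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope          # dissociation constant
    bmax = intercept * kd      # total concentration of binding sites
    return kd, bmax


# Hypothetical data generated from a single-site model with Kd = 2 and Bmax = 10
# (arbitrary concentration units).
free_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_conc = [10.0 * f / (2.0 + f) for f in free_conc]

kd, bmax = scatchard_fit(bound_conc, free_conc)
print(kd, bmax)
```

A smaller Kd corresponds to a higher binding affinity. The linear transformation assumes a single class of non-interacting binding sites and can distort experimental error, so it is offered only as a first approximation.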

After the desired hybridoma cells are identified, the clones can be subcloned by limiting 
dilution procedures and grown by standard methods. Suitable culture media for this purpose 
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium. 
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal. 
The monoclonal antibodies secreted by the subclones can be isolated or purified from the culture 
medium or ascites fluid by conventional immunoglobulin purification procedures such as, for 
example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or 
affinity chromatography. 

The monoclonal antibodies can also be made by recombinant DNA methods, such as 
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the 
invention can be readily isolated and sequenced using conventional procedures (e.g., by using 
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and 
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred 
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are 
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or 
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of 
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for 
example, by substituting the coding sequence for human heavy and light chain constant domains 
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368, 
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the 
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin 
polypeptide can be substituted for the constant domains of an antibody of the invention, or can 
be substituted for the variable domains of one antigen-combining site of an antibody of the 
invention to create a chimeric bivalent antibody. 

5.13.2 Humanized Antibodies 

The antibodies directed against the protein antigens of the invention can further comprise 
humanized antibodies or human antibodies. These antibodies are suitable for administration to 
humans without engendering an immune response by the human against the administered 
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins, 
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other antigen- 
binding subsequences of antibodies) that are principally comprised of the sequence of a human 
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin. 
Humanization can be performed following the method of Winter and co-workers (Jones et al., 
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al., 
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the 
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some 
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding 
non-human residues. Humanized antibodies can also comprise residues which are found neither 
in the recipient antibody nor in the imported CDR or framework sequences. In general, the 
humanized antibody will comprise substantially all of at least one, and typically two, variable 
domains, in which all or substantially all of the CDR regions correspond to those of a non-human 
immunoglobulin and all or substantially all of the framework regions are those of a human 
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at 
least a portion of an immunoglobulin constant region (Fc), typically that of a human 
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol., 
2:593-596 (1992)). 

5.13.3 Human Antibodies 

Fully human antibodies relate to antibody molecules in which essentially the entire 
sequences of both the light chain and the heavy chain, including the CDRs, arise from human 
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein. 
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell 
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma 
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal 
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal 
antibodies may be utilized in the practice of the present invention and may be produced by using 
human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by 
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In: 
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). 

In addition, human antibodies can also be produced using additional techniques, 
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991); 
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by 
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the 
endogenous immunoglobulin genes have been partially or completely inactivated. Upon 
challenge, human antibody production is observed, which closely resembles that seen in humans 
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach 
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al. 
(Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature 
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and 
Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93 (1995)). 

Human antibodies may additionally be produced using transgenic nonhuman animals 
which are modified so as to produce fully human antibodies rather than the animal's endogenous 
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The 
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host 
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins 
are inserted into the host's genome. The human genes are incorporated, for example, using yeast 
artificial chromosomes containing the requisite human DNA segments. An animal which 
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate 
transgenic animals containing fewer than the full complement of the modifications. The 
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™ 
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B cells 
which secrete fully human immunoglobulins. The antibodies can be obtained directly from the 
animal after immunization with an immunogen of interest, as, for example, a preparation of a 
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as 
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the 
immunoglobulins with human variable regions can be recovered and expressed to obtain the 
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for 
example, single chain Fv molecules. 
An example of a method of producing a nonhuman host, exemplified as a mouse, lacking 
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No. 
5,939,598. It can be obtained by a method including deleting the J segment genes from at least 
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the 
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus, 
the deletion being effected by a targeting vector containing a gene encoding a selectable marker; 
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells 
contain the gene encoding the selectable marker. 

A method for producing an antibody of interest, such as a human antibody, is disclosed in 
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a 
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing 
an expression vector containing a nucleotide sequence encoding a light chain into another 
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an 
antibody containing the heavy chain and the light chain. 

In a further improvement on this procedure, a method for identifying a clinically relevant 
epitope on an immunogen, and a correlative method for selecting an antibody that binds 
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication 
WO 99/53049. 

5.13.4 Fab Fragments and Single Chain Antibodies 

According to the invention, techniques can be adapted for the production of single-chain 
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778). 
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g., 
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of 
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments, 
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen 
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2 
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated 
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the 
treatment of the antibody molecule with papain and a reducing agent and (iv) Fv fragments. 

5.13.5 Bispecific Antibodies 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that 
have binding specificities for at least two different antigens. In the present case, one of the 
binding specificities is for an antigenic protein of the invention. The second binding target is any 
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit. 

Methods for making bispecific antibodies are known in the art. Traditionally, the 
recombinant production of bispecific antibodies is based on the co-expression of two 
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different 
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random 
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a 
potential mixture of ten different antibody molecules, of which only one has the correct 
bispecific structure. The purification of the correct molecule is usually accomplished by affinity 
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May 
1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659. 
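
The figure of ten different antibody molecules follows from simple counting, which may be sketched as below. The sketch assumes that each heavy chain can pair with either light chain and that any two heavy/light half-molecules can associate; the chain labels H1, H2, L1 and L2 and the function name are hypothetical names introduced for illustration.

```python
from itertools import combinations_with_replacement, product


def quadroma_products(heavy=("H1", "H2"), light=("L1", "L2")):
    """Enumerate the distinct antibody molecules from random chain assortment.

    Each molecule is modelled as an unordered pair of heavy/light half-molecules.
    """
    half_molecules = list(product(heavy, light))       # four half-molecule types
    return list(combinations_with_replacement(half_molecules, 2))


molecules = quadroma_products()

# The correct bispecific molecule pairs each heavy chain with its cognate light chain.
desired = {("H1", "L1"), ("H2", "L2")}
correct = [m for m in molecules if set(m) == desired]

print(len(molecules), len(correct))
```

The enumeration counts distinct species only. It says nothing about their relative abundance, which depends on expression levels and pairing preferences.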

Antibody variable domains with the desired binding specificities (antibody-antigen 
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion 
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of 
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region 
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions. 
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin 
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable 
host organism. For further details of generating bispecific antibodies see, for example, Suresh et 
al., Methods in Enzymology, 121:210 (1986). 

According to another approach described in WO 96/27011, the interface between a pair 
of antibody molecules can be engineered to maximize the percentage of heterodimers which are 
recovered from recombinant cell culture. The preferred interface comprises at least a part of the 
CH3 region of an antibody constant domain. In this method, one or more small amino acid side 
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g. 
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side 
chain(s) are created on the interface of the second antibody molecule by replacing large amino 
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for 
increasing the yield of the heterodimer over other unwanted end-products such as homodimers. 

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g. 
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody 
fragments have been described in the literature. For example, bispecific antibodies can be 
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure 
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These 
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to 
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments 
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB 
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is 
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific 
antibody. The bispecific antibodies produced can be used as agents for the selective 
immobilization of enzymes. 

Additionally, Fab' fragments can be directly recovered from E. coli and chemically 
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe 
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment 
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form 
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells 
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity 
of human cytotoxic lymphocytes against human breast tumor targets. 

Various techniques for making and isolating bispecific antibody fragments directly from 
recombinant cell culture have also been described. For example, bispecific antibodies have been 
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553 (1992). The 
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two 
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region 
to form monomers and then re-oxidized to form the antibody heterodimers. This method can 
also be utilized for the production of antibody homodimers. The "diabody" technology 
described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an 
alternative mechanism for making bispecific antibody fragments. The fragments comprise a 
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker 
which is too short to allow pairing between the two domains on the same chain. Accordingly, 
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH 
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for 
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been 
reported. See, Gruber et al., J. Immunol. 152:5368 (1994). 

Antibodies with more than two valencies are contemplated. For example, trispecific 
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991). 

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which 
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an 
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on 
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for 
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular 
defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies can also 
be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies 
possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide 
chelator, such as EOTUBE, DPTA, DOTA, or TETA. Another bispecific antibody of interest 
binds the protein antigen described herein and further binds tissue factor (TF). 



5.13.6 Heteroconjugate Antibodies 

Heteroconjugate antibodies are also within the scope of the present invention. 
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies 
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent 
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089). 
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic 
protein chemistry, including those involving crosslinking agents. For example, immunotoxins 
can be constructed using a disulfide exchange reaction or by forming a thioether bond. 
Examples of suitable reagents for this purpose include iminothiolate and methyl-4- 
mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980. 



5.13.7 Effector Function Engineering 

It can be desirable to modify the antibody of the invention with respect to effector function, so as 
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine 
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond 
formation in this region. The homodimeric antibody thus generated can have improved 
internalization capability and/or increased complement-mediated cell killing and antibody- 
dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp Med., 176: 1191-1195 (1992) 
and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced anti- 
tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff 
et al. Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that 
has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities. 
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989). 


5.13.8 Immunoconjugates 

The invention also pertains to immunoconjugates comprising an antibody conjugated to a 
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of 
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a 
radioconjugate). 
Chemotherapeutic agents useful in the generation of such immunoconjugates have been 
described above. Enzymatically active toxins and fragments thereof that can be used include 
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from 
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin, 
Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and 
PAP-S), momordica charantia inhibitor, curcin, crotin, saponaria officinalis inhibitor, gelonin, 
mitogellin, restrictocin, phenomycin, enomycin, and the tricothecenes. A variety of 
radionuclides are available for the production of radioconjugated antibodies. Examples include 
212Bi, 131I, 131In, 90Y, and 186Re. 
Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional 
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate (SPDP), 
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCL), 
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido 
compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as 
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as toluene 2,6-diisocyanate), 
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a 
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987). 
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid (MX- 
DTPA) is an exemplary chelating agent for conjugation of radionuclide to the antibody. See 
WO94/11026. 

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn
conjugated to a cytotoxic agent.



4.14 COMPUTER READABLE SEQUENCES

In one application of this embodiment, a nucleotide sequence of the present invention can 
be recorded on computer readable media. As used herein, "computer readable media" refers to
any medium which can be read and accessed directly by a computer. Such media include, but
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled
artisan can readily appreciate how any of the presently known computer readable media can
be used to create a manufacture comprising computer readable medium having recorded thereon
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for
storing information on computer readable medium. A skilled artisan can readily adopt any of the
presently known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a 
computer readable medium having recorded thereon a nucleotide sequence of the present 
invention. The choice of the data storage structure will generally be based on the means chosen 
to access the stored information. In addition, a variety of data processor programs and formats 
can be used to store the nucleotide sequence information of the present invention on computer 
readable medium. The sequence information can be represented in a word processing text file, 
formatted in commercially-available software such as WordPerfect and Microsoft Word, or 
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, 
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring 
formats (e.g. text file or database) in order to obtain computer readable medium having recorded 
thereon the nucleotide sequence information of the present invention. 
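
By way of illustration only, the two storage styles mentioned above (a plain ASCII text representation and a database application) might be sketched as follows. SQLite, the table name "sequences", the FASTA-style header, and the function names are assumptions chosen for this sketch; they are not named in the specification, which mentions DB2, Sybase and Oracle as examples.

```python
import sqlite3


def store_sequences(conn, records):
    """Record (seq_id, residues) pairs in a database table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences "
        "(seq_id INTEGER PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO sequences (seq_id, residues) VALUES (?, ?)", records
    )
    conn.commit()


def fetch_sequence(conn, seq_id):
    """Return the stored residues for seq_id, or None if it is not recorded."""
    row = conn.execute(
        "SELECT residues FROM sequences WHERE seq_id = ?", (seq_id,)
    ).fetchone()
    return row[0] if row else None


def to_fasta(seq_id, residues, width=60):
    """Render one sequence as ASCII text with a FASTA-style header line."""
    lines = [residues[i:i + width] for i in range(0, len(residues), width)]
    return ">SEQ_ID_NO_%d\n%s\n" % (seq_id, "\n".join(lines))
```

The choice between the two forms would, as stated above, generally follow from the means chosen to access the stored information: a text file suits sequential reading, while a database table suits retrieval by sequence identifier.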

By providing any of the nucleotide sequences SEQ ID NO: 1-1786 and 3573-5358 or a 
representative fragment thereof; or a nucleotide sequence at least 95% identical to any of the 
nucleotide sequences of SEQ ID NO: 1-1786 and 3573-5358 in computer readable form, a skilled
artisan can routinely access the sequence information for a variety of purposes. Computer
software is publicly available which allows a skilled artisan to access sequence information
provided in a computer readable medium. The examples which follow demonstrate how
software which implements the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and
BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase system 
is used to identify open reading frames (ORFs) within a nucleic acid sequence. Such ORFs may 
be protein encoding fragments and may be useful in producing commercially important proteins 
such as enzymes used in fermentation reactions and in the production of commercially useful 
metabolites. 
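
As a minimal sketch of what identifying ORFs within a nucleic acid sequence involves, the following scans the three forward reading frames for an ATG followed in frame by a stop codon. It is not the BLAST or BLAZE software referred to above; the function name, the minimum-length parameter and the restriction to the forward strand and the standard stop codons are assumptions made for illustration.

```python
# Codons that terminate translation in the standard genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) for each ATG...stop ORF on the forward strand.

    Coordinates are 0-based and end-exclusive, and include the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the first ATG seen since the last stop codon
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs
```

A fuller treatment would also scan the reverse complement and would typically compare each candidate ORF against known sequences before treating it as protein encoding.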

As used herein, "a computer-based system" refers to the hardware means, software 
means, and data storage means used to analyze the nucleotide sequence information of the 
present invention. The minimum hardware means of the computer-based systems of the present 
invention comprises a central processing unit (CPU), input means, output means, and data 
storage means. A skilled artisan can readily appreciate that any one of the currently available 
computer-based systems is suitable for use in the present invention. As stated above, the
computer-based systems of the present invention comprise a data storage means having stored
therein a nucleotide sequence of the present invention and the necessary hardware means and
software means for supporting and implementing a search means. As used herein, "data storage
means" refers to memory which can store nucleotide sequence information of the present 
invention, or a memory access means which can access manufactures having recorded thereon 
the nucleotide sequence information of the present invention. 

As used herein, "search means" refers to one or more programs which are implemented
on the computer-based system to compare a target sequence or target structural motif with the
sequence information stored within the data storage means. Search means are used to identify
fragments or regions of a known sequence which match a particular target sequence or target
motif. A variety of known algorithms are disclosed publicly and a variety of commercially
available software for conducting search means are and can be used in the computer-based
systems of the present invention. Examples of such software include, but are not limited to,
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A
skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will be
present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide
residues. However, it is well recognized that searches for commercially important fragments,
such as sequence fragments involved in gene expression and protein processing, may be of
shorter length.
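
The relationship between target length and chance occurrence can be made concrete with a rough estimate. The sketch below assumes, purely for illustration, a database of independent, equally frequent residues and counts only exact matches; real databases are not random, so the figures indicate orders of magnitude only. The function name and parameters are hypothetical.

```python
def expected_random_hits(target_len, database_len, alphabet_size=4):
    """Expected number of chance exact matches of one target in a random database.

    Assumes independent, uniformly distributed residues (4 for nucleotides,
    20 for amino acids).
    """
    positions = max(database_len - target_len + 1, 0)
    return positions * alphabet_size ** (-target_len)
```

Under these assumptions a six-nucleotide target is expected to occur by chance a few hundred times in a database of one million nucleotides, whereas a thirty-nucleotide target is expected to occur far less than once.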

As used herein, "a target structural motif," or "target motif," refers to any rationally 
selected sequence or combination of sequences in which the sequence(s) are chosen based on a 
three-dimensional configuration which is formed upon the folding of the target motif. There are
a variety of target motifs known in the art. Protein target motifs include, but are not limited to, 
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited 
to, promoter sequences, hairpin structures and inducible expression elements (protein binding 
sequences). 

4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used to 
control gene expression through triple helix formation or antisense DNA or RNA, both of which 
methods are based on the binding of a polynucleotide sequence to DNA or RNA. 
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into 
polypeptide. Both techniques have been demonstrated to be effective in model systems. 
Information contained in the sequences of the present invention is necessary for the design of an 
antisense or triple helix oligonucleotide. 
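
To illustrate how sequence information feeds into antisense design, the following sketch derives a DNA oligonucleotide complementary to a chosen window of an mRNA, enforcing the 20 to 40 base length preferred above. The function name, the choice of a DNA (rather than RNA) oligonucleotide, and the example sequence are assumptions for illustration; selecting an effective target region in practice involves considerations not modeled here.

```python
# Watson-Crick partners of mRNA bases, written as DNA bases.
RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_oligo(mrna, start, length=20):
    """Return the DNA oligonucleotide (5'->3') complementary to mrna[start:start+length]."""
    if not 20 <= length <= 40:
        raise ValueError("length should be 20 to 40 bases")
    # Accept sequences written with T by normalising them to U first.
    target = mrna.upper().replace("T", "U")[start:start + length]
    if len(target) < length:
        raise ValueError("window extends past the end of the mRNA")
    # Complement each base, then reverse so the result reads 5'->3'.
    return target.translate(RNA_TO_DNA_COMPLEMENT)[::-1]
```

For example, antisense_oligo("AUGGCCAAGUUCGGAACCUUAGC", 0) returns the 20-mer complementary to the first twenty bases of that hypothetical message.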

4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of 
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic 
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated 
with a suitable label.

In general, methods for detecting a polynucleotide of the invention can comprise 
contacting a sample with a compound that binds to and forms a complex with the polynucleotide 
for a period sufficient to form the complex, and detecting the complex, so that if a complex is 
detected, a polynucleotide of the invention is detected in the sample. Such methods can also 
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers 
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed 
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is 
detected in the sample. 

In general, methods for detecting a polypeptide of the invention can comprise contacting 
a sample with a compound that binds to and forms a complex with the polypeptide for a period 
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a 
polypeptide of the invention is detected in the sample. 

In detail, such methods comprise incubating a test sample with one or more of the 
antibodies or one or more of the nucleic acid probes of the present invention and assaying for 
binding of the nucleic acid probes or antibodies to components within the test sample. 

Conditions for incubating a nucleic acid probe or antibody with a test sample vary. 
Incubation conditions depend on the format employed in the assay, the detection methods 
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One 
skilled in the art will recognize that any one of the commonly available hybridization, 
amplification or immunological assay formats can readily be adapted to employ the nucleic acid
probes or antibodies of the present invention. Examples of such assays can be found in Chard,
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers,
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry,
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the
present invention include cells, protein or membrane extracts of cells, or biological fluids such as
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method
will vary based on the assay format, nature of the detection method and the tissues, cells or
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane
extracts of cells are well known in the art and can readily be adapted in order to obtain a
sample which is compatible with the system utilized. 

In another embodiment of the present invention, kits are provided which contain the 
necessary reagents to carry out the assays of the present invention. Specifically, the invention 
provides a compartment kit to receive, in close confinement, one or more containers which
comprises: (a) a first container comprising one of the probes or antibodies of the present 
invention; and (b) one or more other containers comprising one or more of the following: wash 
reagents, reagents capable of detecting presence of a bound probe or antibody. 

In detail, a compartment kit includes any kit in which reagents are contained in separate 
containers. Such containers include small glass containers, plastic containers or strips of plastic
or paper. Such containers allow one to efficiently transfer reagents from one compartment to
another compartment such that the samples and reagents are not cross-contaminated, and the 
agents or solutions of each container can be added in a quantitative fashion from one 
compartment to another. Such containers will include a container which will accept the test 
sample, a container which contains the antibodies used in the assay, containers which contain
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which 
contain the reagents used to detect the bound antibody or probe. Types of detection reagents 
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the 
primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of 
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed
probes and antibodies of the present invention can be readily incorporated into one of the 
established kit formats which are well known in the art. 



4.17 MEDICAL IMAGING

The novel polypeptides and binding partners of the invention are useful in medical
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the
invention is involved in the immune response, for imaging sites of inflammation or infection).
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a 
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target 
site. 

4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present invention 
further provides methods of obtaining and identifying agents which bind to a polypeptide 
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID NO:
1-1786 and 3573-5358, or bind to a specific domain of the polypeptide encoded by the nucleic
acid. In detail, said method comprises the steps of: 

(a) contacting an agent with an isolated protein encoded by an ORF of the present 
invention, or nucleic acid of the invention; and 

(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting 
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds 
to a polynucleotide of the invention is identified. 

Likewise, in general, therefore, such methods for identifying compounds that bind to a 
polypeptide of the invention can comprise contacting a compound with a polypeptide of the 
invention for a time sufficient to form a polypeptide/compound complex, and detecting the 
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a 
polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can also 
comprise contacting a compound with a polypeptide of the invention in a cell for a time 
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a 
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a compound that 
binds a polypeptide of the invention is identified. 

Compounds identified via such methods can include compounds which modulate the 
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to
activity observed in the absence of the compound). Alternatively, compounds identified via such
methods can include compounds which modulate the expression of a polynucleotide of the
invention (that is, increase or decrease expression relative to expression levels observed in the
absence of the compound). Compounds, such as compounds identified via the methods of the
invention, can be tested using standard assays well known to those of skill in the art for their
ability to modulate activity/expression. 

The agents screened in the above assay can be, but are not limited to, peptides, 
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected 
and screened at random or rationally selected or designed using protein modeling techniques. 

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and 
the like are selected at random and are assayed for their ability to bind to the protein encoded by 
the ORF of the present invention. Alternatively, agents may be rationally selected or designed. 
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen 
based on the configuration of the particular protein. For example, one skilled in the art can 
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the 
like, capable of binding to a specific peptide sequence, in order to generate rationally designed 
antipeptide peptides, for example see Hruby et al., "Application of Synthetic Peptides: Antisense
Peptides," In Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and
Kasprzak et al., Biochemistry 28:9230-8 (1989), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly 
described, can be used to control gene expression through binding to one of the ORFs or EMFs 
of the present invention. As described above, such agents can be randomly screened or 
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design 
sequence specific or element specific agents, modulating the expression of either a single ORF or 
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding 
agents are agents which contain base residues which hybridize or form a triple helix formation 
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester, 
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have 
base attachment capacity. 

Agents suitable for use in these methods preferably contain 20 to 40 bases and are 
designed to be complementary to a region of the gene involved in transcription (triple helix - see 
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et 
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560 
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca 
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems. 
Information contained in the sequences of the present invention is necessary for the design of an 
antisense or triple helix oligonucleotide and other DNA binding agents. 

Agents which bind to a protein encoded by one of the ORFs of the present invention can 
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the 
present invention can be formulated using known techniques to generate a pharmaceutical 
composition. 

4.19 USE OF NUCLEIC ACIDS AS PROBES 

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid 
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The 
hybridization probes of the subject invention may be derived from any of the nucleotide 
sequences SEQ ID NO: 1-1786 and 3573-5358. Because the corresponding gene is only
expressed in a limited number of tissues, a hybridization probe derived from any of the
nucleotide sequences SEQ ID NO: 1-1786 and 3573-5358 can be used as an indicator of the
presence of RNA of a cell type of such a tissue in a sample.

Any suitable hybridization technique can be employed, such as, for example, in situ 
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in 
PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The 
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a 
degenerate pool of possible sequences for identification of closely related genomic sequences. 
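
A degenerate pool can be thought of as the set of discrete sequences obtained by expanding each ambiguous position. The sketch below assumes the probe is written with the standard IUPAC nucleotide ambiguity codes, which is a notational assumption not stated in the text; the function name is hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe):
    """List every discrete sequence represented by a degenerate probe."""
    choices = (IUPAC[base] for base in probe.upper())
    return ["".join(combo) for combo in product(*choices)]
```

The pool size is the product of the number of alternatives at each position, so it grows quickly; a probe with many fully degenerate positions may dilute each individual sequence below a useful concentration.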

Other means for producing specific hybridization probes for nucleic acids include the 
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors 
are known in the art and are commercially available and may be used to synthesize RNA probes 
in vitro by means of the addition of the appropriate RNA polymerase such as T7 or SP6 RNA
polymerase and the appropriate radioactively labeled nucleotides. The nucleotide sequences may 
be used to construct hybridization probes for mapping their respective genomic sequences. The 
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a 
chromosome using well known genetic and/or chromosomal mapping techniques. These 
techniques include in situ hybridization, linkage analysis against known chromosomal markers, 
hybridization screening with libraries or flow-sorted chromosomal preparations specific to 
known chromosomes, and the like. The technique of fluorescent in situ hybridization of 



WO 01/53312 PCT/US00/34263 

chromosome spreads has been described, among other places, in Verma et al. (1988) Human
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY. 

Fluorescent in situ hybridization of chromosomal preparations and other physical 
chromosome mapping techniques may be correlated with additional genetic map data. Examples 
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or
predisposition to a specific disease) may help delimit the region of DNA associated with that
genetic disease. The nucleotide sequences of the subject invention may be used to detect
differences in gene sequences between normal, carrier or affected individuals.

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for 
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced 
using an automated oligonucleotide synthesizer. 

Support bound oligonucleotides may be prepared by any of the methods known to those of
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72);
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all
references being specifically incorporated herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6,
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal, Oslo.
Of course, this same linking chemistry is applicable to coating any surface with streptavidin.
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies 
(Alameda, CA). 

Nunc Laboratories (Naperville, IL) is also selling suitable material that could be used. Nunc 
Laboratories have developed a method by which DNA can be covalently bound to the microwell 
surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with secondary amino
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has
been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is employed
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the
CovaLink NH secondary amino groups that are positioned at the end of spacer arms covalently
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and 
then streptavidin used to bind the probes. 

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/ul) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole,
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. A ss DNA solution is
then dispensed into CovaLink NH strips (75 ul/well) standing on ice.

Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in
10 mM 1-MeIm7, is made fresh and 25 ul added per well. The strips are incubated for 5 hours at
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are
washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).

It is contemplated that a further suitable method for use with the present invention is that 
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by 
reference. This method of preparing an oligonucleotide bound to a support involves attaching a
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include
nucleoside phosphoramidite and nucleoside hydrogen phosphonate.

An on-chip strategy for the preparation of DNA probe
arrays may be employed. For example, addressable laser-activated photodeprotection may be
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described by
Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by reference. Probes may also
be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic Acids Res.
19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem.
169(1) 104-8; all references being specifically incorporated herein.

To link an oligonucleotide to a nylon support, as described by Van Ness et al. (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al., (1994) PNAS USA 91(11) 5022-6, incorporated
herein by reference. These authors used current photolithographic techniques to generate arrays of
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be
generated in this manner.

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS 

The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic 
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA, 
including mRNA without any amplification steps. For example, Sambrook et al. (1989) describes
three protocols for the isolation of high molecular weight DNA from mammalian cells (p. 
9.14-9.23). 

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be
prepared in 2-500 ml of final volume. 

The nucleic acids would then be fragmented by any of the methods known to those of skill 
in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook et
al. (1989), shearing by ultrasound and NaOH treatment.

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) Nucleic
Acids Res. 18(24) 7455-6, incorporated herein by reference. In this method, DNA samples are
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever
device allows controlled application of low to intermediate pressures to the cell. The results of these
studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA
fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using the two 
base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic Acids Res.
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation
of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and
sequencing. 

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy 
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of 
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small 
molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the 
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size 
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lac Z minus 
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and 
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate 
consistent with random fragmentation. 
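
The site specificities described above can be illustrated with a short Python sketch. It is only an in silico illustration under stated assumptions: the function names are hypothetical, the digest is treated as complete (a real CviJI** digest is partial and quasi-random), and only one strand is scanned. Pu stands for a purine (A or G) and Py for a pyrimidine (C or T).

```python
import re

PU = "[AG]"  # purine
PY = "[CT]"  # pyrimidine

# Normal CviJI site (PuGCPy) and the relaxed CviJI** sites named in the text
# (PuGCPy, PyGCPy and PuGCPu). Lookaheads are used so overlapping sites are found.
NORMAL_SITE = re.compile(f"(?=({PU}GC{PY}))")
RELAXED_SITE = re.compile(f"(?=({PU}GC{PY}|{PY}GC{PY}|{PU}GC{PU}))")


def cut_positions(seq, pattern):
    """Return 0-based split indices; each blunt cut falls between G and C."""
    seq = seq.upper()
    # m.start() is the first base of the 4-base site, so the C sits at start + 2.
    return [m.start() + 2 for m in pattern.finditer(seq)]


def fragments(seq, pattern):
    """Split seq at every cut position (assumes a complete digest)."""
    bounds = [0] + cut_positions(seq, pattern) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]
```

For example, `fragments("TTAGCTAATGCTAA", NORMAL_SITE)` splits only at the AGCT site, whereas `RELAXED_SITE` also splits at TGCT (a PyGCPy site).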

As reported in the literature, advantages of this approach compared to sonication and 
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 µg instead of 2-5 
µg); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel 
electrophoresis and elution are needed). 

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is 
important to denature the DNA to give single stranded pieces available for hybridization. This is 
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled 
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the 
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art. 

422 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane. 
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an 
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a 
nylon membrane. By offset printing, a density of dots higher than the density of the wells is 
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By 
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays) 
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same 
gene) from different individuals, or may be different, overlapped genomic clones. Each of the 
subarrays may represent replica spotting of the same samples. In one example, a selected gene 
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in 
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is 
prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane. 
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the 
dot span may be 1 mm² and there may be a 1 mm space between subarrays. 
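
The offset-printing layout in the 64-patient example can be sketched as a coordinate calculation. The sketch below is an illustration only and rests on assumptions not stated in the text: a 9 mm pin pitch (the usual 96-well spacing), an 8 x 8 arrangement of the 64 dots in each subarray, and a 1 mm dot pitch. The function name is hypothetical.

```python
WELL_ROWS, WELL_COLS = 8, 12   # 96-pin device matching a 96-well plate
WELL_PITCH_MM = 9.0            # assumed pin-to-pin spacing
DOT_PITCH_MM = 1.0             # assumed dot-to-dot spacing within a subarray
GRID = 8                       # assumed 8 x 8 = 64 dots per subarray


def spot_position(patient, well):
    """Return (x_mm, y_mm) of the dot printed for `patient` (0-63) by pin `well` (0-95)."""
    if not (0 <= patient < GRID * GRID and 0 <= well < WELL_ROWS * WELL_COLS):
        raise ValueError("patient or well index out of range")
    well_row, well_col = divmod(well, WELL_COLS)   # which subarray
    dot_row, dot_col = divmod(patient, GRID)       # offset inside the subarray
    x = well_col * WELL_PITCH_MM + dot_col * DOT_PITCH_MM
    y = well_row * WELL_PITCH_MM + dot_row * DOT_PITCH_MM
    return x, y
```

Under these assumptions the 6144 dots span 106 mm by 70 mm, which fits on an 8 x 12 cm membrane, and the eight dots in each direction leave one empty 1 mm position before the next subarray begins.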

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois) 
which may be partitioned by physical spacers, e.g., a plastic grid molded over the membrane, the grid 
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic 
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage 
screens or x-ray films. 

The present invention is illustrated in the following examples. Upon consideration of the 
present disclosure, one of skill in the art will appreciate that many other embodiments and variations 
may be made in the scope of the present invention. Accordingly, it is intended that the broader 
aspects of the present invention not be limited to the disclosure of the following examples. The 
present invention is not to be limited in scope by the exemplified embodiments which are intended 
as illustrations of single aspects of the invention, and compositions and methods which are 
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and 
variations in the practice of the invention are expected to occur to those skilled in the art upon 
consideration of the present preferred embodiments. Consequently, the only limitations which 
should be placed upon the scope of the invention are those which appear in the appended claims. 

All references cited within the body of the instant specification are hereby incorporated by 
reference in their entirety. 

5.0 EXAMPLES 

5.1.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries 

A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various 
human tissues and in some cases isolated from a genomic library derived from human chromosome 
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The 
inserts of the library were amplified with PCR using primers specific for the vector sequences which 
flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and screened 
with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were clustered 
into groups of similar or identical sequences. Representative clones were selected for sequencing. 

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical 
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye 
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems 
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid 
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction. 

5.1.2 EXAMPLE 2 
Assemblage of Novel Nucleic Acids 

The contigs or nucleic acids of the present invention, designated as SEQ ID NO: 3573-5358, 
were assembled using an EST sequence as a seed. Then a recursive algorithm was used to extend 
the seed EST into an extended assemblage, by pulling additional sequences from different databases 
(i.e., Hyseq's database containing EST sequences, dbEST version 114, gb pri 114, and UniGene 
version 101) that belong to this assemblage. The algorithm terminated when there were no 
additional sequences from the above databases that would extend the assemblage. Inclusion of 
component sequences into the assemblage was based on a BLASTN hit to the extending assemblage 
with BLAST score greater than 300 and percent identity greater than 95%. 
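
The seed-extension procedure described above can be sketched as follows. This is a minimal illustration, not the program actually used: it is written as an iterative loop rather than a recursive one, and `find_hits` is a hypothetical caller-supplied function standing in for a BLASTN search of the current assemblage against one database. Only the two thresholds (score greater than 300, identity greater than 95%) come from the text.

```python
def extend_assemblage(seed_id, databases, find_hits,
                      min_score=300, min_identity=95.0):
    """Grow an assemblage from a seed EST until no database sequence extends it.

    find_hits(member_ids, database) is a hypothetical search function that
    yields (sequence_id, blast_score, percent_identity) tuples.
    """
    members = {seed_id}
    while True:
        added = set()
        for database in databases:
            for seq_id, score, identity in find_hits(members, database):
                # Keep only new sequences that pass both thresholds from the text.
                if (seq_id not in members
                        and score > min_score
                        and identity > min_identity):
                    added.add(seq_id)
        if not added:          # nothing extends the assemblage: terminate
            return members
        members |= added
```

The loop re-queries every database after each round because a newly added sequence may itself bring further sequences within reach of the assemblage.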

A polypeptide was predicted to be encoded by each of SEQ ID NO: 3573-5358 as set forth 
below. The polypeptides were predicted using a software program called FASTY (available from 
http://fasta.bioch.virginia.edu), which selects a polypeptide based on a comparison of the translated 
novel polynucleotide to known polynucleotides (W.R. Pearson, Methods in Enzymology, 183:63-98 
(1990), herein incorporated by reference). The predicted polypeptides are shown in Table 7. 

5.2.2 EXAMPLE 3 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank. Other computer programs which may 
have been used in the editing process were phredPhrap and Consed (University of Washington) and 
ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice 
variants, resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1-327. 
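
As an illustration of the kind of check that precedes hand editing of incorrect stop codons, the following sketch flags stop codons that occur before the final codon of a chosen reading frame. It is a hypothetical helper, not part of PHRAP, CAP4, Consed or the Hyseq tools named above, and it assumes the standard genetic code and a sequence written as DNA.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, DNA alphabet


def internal_stops(cdna, frame=0):
    """Return 0-based nucleotide positions of stop codons before the last codon."""
    seq = cdna.upper()
    # Walk complete codons only, starting at the requested frame offset.
    codons = [(i, seq[i:i + 3]) for i in range(frame, len(seq) - 2, 3)]
    # The final codon is allowed to be a stop; anything earlier is suspicious.
    return [i for i, codon in codons[:-1] if codon in STOP_CODONS]
```

A non-empty result points to a position worth inspecting; whether it reflects a frame shift, a sequencing error or a genuine stop still has to be decided by comparison with homologous sequences.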

Table 1 shows the various tissue sources of SEQ ID NO: 1-327. 

The nearest neighbor results for SEQ ID NO: 1-327 were obtained by a FASTA version 3 
search against Genpept release 117, using the FASTXY algorithm. FASTXY is an improved 
version of FASTA alignment which allows in-codon frame shifts. The nearest neighbor result 
showed the closest homologue for SEQ ID NO: 1-327 from Genpept. The translated amino acid 
sequences encoded by the nucleic acid sequences are shown in the Sequence Listing. The 
nearest neighbor results for SEQ ID NO: 1-327 are shown in Table 2 below. 

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999), herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 
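
How a maximum S score and a mean S score relate to per-residue output can be sketched as below. This is an illustration only and does not reproduce SignalP: it assumes that a per-residue S score list and a predicted cleavage position are already available, and that the mean S score is the average over the residues preceding the cleavage site. The function name and input values are hypothetical.

```python
def s_score_summary(s_scores, cleavage_index):
    """Return (max_s, mean_s) for a list of per-residue S scores.

    cleavage_index is the number of residues in the predicted signal peptide,
    i.e. the cleavage falls just before s_scores[cleavage_index].
    """
    if not 0 < cleavage_index <= len(s_scores):
        raise ValueError("cleavage_index must lie within the sequence")
    max_s = max(s_scores)                      # highest S score anywhere
    signal_part = s_scores[:cleavage_index]    # residues of the signal peptide
    mean_s = sum(signal_part) / len(signal_part)
    return max_s, mean_s
```

A high mean S score over the N-terminal region together with a high maximum is what a table such as Table 5 summarizes for each polypeptide.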

5.3.2 EXAMPLE 4 

Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117, 
UniGene version 117, Genpept release 117). Other computer programs which may have been used 
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, 
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants, 
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 328-1413. 

Table 1 shows the various tissue sources of SEQ ID NO: 328-1413. 

The nearest neighbor results for SEQ ID NO: 328-1413 were obtained by a BLASTP 
version 2.0al 19MP-WashU search against Genpept release 118, using the BLAST algorithm. The 
nearest neighbor result showed the closest homologue for SEQ ID NO: 328-1413 from Genpept. 
The translated amino acid sequences encoded by the nucleic acid sequences are shown in 
the Sequence Listing. The nearest neighbor results for SEQ ID NO: 328-1413 are shown in 
Table 2 below. 

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999), herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 

5.3.2 EXAMPLE 5 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117, 
UniGene version 117, Genpept release 117). Other computer programs which may have been used 
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, 
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants, 
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1414-1652. 

Table 1 shows the various tissue sources of SEQ ID NO: 1414-1652. 

The nearest neighbor results for SEQ ID NO: 1414-1652 were obtained by a BLASTP 
version 2.0al 19MP-WashU search against Genpept release 118, using the BLAST algorithm. The 
nearest neighbor result showed the closest homologue for SEQ ID NO: 1414-1652 from 
Genpept. The translated amino acid sequences encoded by the nucleic acid sequences are 
shown in the Sequence Listing. The nearest neighbor results for SEQ ID NO: 1414-1652 are 
shown in Table 2 below. 

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999), herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 

5.4.2 EXAMPLE 6 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118, 
UniGene version 118, Genpept release 118). Other computer programs which may have been used 
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, 
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants, 
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1653-1745. 

Table 1 shows the various tissue sources of SEQ ID NO: 1653-1745. 

The homologies for SEQ ID NO: 1653-1745 were obtained by a BLASTP version 2.0al 
19MP-WashU search against Genpept release 118, using the BLAST algorithm. The results showed 
homologues for SEQ ID NO: 1653-1745 from Genpept. The translated amino acid sequences 
encoded by the nucleic acid sequences are shown in the Sequence Listing. The homologues 
with identifiable functions for SEQ ID NO: 1653-1745 are shown in Table 2 below. 

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999), herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 

5.5.2 EXAMPLE 7 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 119, gb pri 119, 
UniGene version 119, Genpept release 119). Other computer programs which may have been used 
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, 
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants, 
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1746-1768. 

Table 1 shows the various tissue sources of SEQ ID NO: 1746-1768. 

The homologies for SEQ ID NO: 1746-1768 were obtained by a BLASTP version 2.0al 
19MP-WashU search against Genpept release 119, using the BLAST algorithm. The results showed 
homologues for SEQ ID NO: 1746-1768 from Genpept. The translated amino acid sequences 
encoded by the nucleic acid sequences are shown in the Sequence Listing. The homologues 
with identifiable functions for SEQ ID NO: 1746-1768 are shown in Table 2 below. 

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999), herein incorporated by reference), all the sequences were examined 
to determine whether they had identifiable signature regions. Table 3 shows the signature region 
found in the indicated polypeptide sequences, the description of the signature, the eMatrix 
p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process for 
identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also disclosed by 
Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the publication 
"Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites," 
Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum 
S score and a mean S score, as described in the Nielsen et al. reference, were obtained for the 
polypeptide sequences. Table 5 shows the position of the signal peptide in each of the polypeptides 
and the maximum score and mean score associated with that signal peptide. 

5.6.2 EXAMPLE 8 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 120, gb pri 120, 
UniGene version 120, Genpept release 120). Other computer programs which may have been used 
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, 
ed-ext and gc-zip-2 (Hyseq, Inc.). The translated amino acid sequences encoded by the nucleic acid 
sequences are shown in the Sequence Listing. The full-length nucleotide sequences, including splice 
variants, resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 
1769-1786. 

Table 1 shows the various tissue sources of SEQ ID NO: 1769-1786. 

The homologies for SEQ ID NO: 1769-1786 were obtained by a BLASTP version 2.0al 
19MP-WashU search against Genpept release 120 and the amino acid version of Geneseq 
released on October 26, 2000, using the BLAST algorithm. The results showed homologues for 
SEQ ID NO: 1769-1786 from Genpept. The homologues with identifiable functions for SEQ ID 
NO: 1769-1786 are shown in Table 2 below. 

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999), herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 

Table 6 is a correlation table of all of the sequences and the SEQ ID NOS. 

TABLE 1 

Tissue Origin: adult brain; RNA Source: GIBCO; Hyseq Library Name: AB3001; SEQ ID NOS: 

9 19-21 50-51 65-66 72 78 80 82 
85 87 107-108 113 116 123 138 
140 150-152 159 169 177 192-193 
202-203 212-214 225-226 235-236 
251 258 268-269 272 280-281 295 
298 301 321 326 331-332 334 356 
357 362 369 379 382-383 416 423 
443 459-460 473 475 477 488 496 
500 503 519 526 547 574 582 587 
608-609 613 618 633-634 645-646 
652 657-658 660 669-671 678 687 
695 697 710 715 724 731 775-777 
796 804 811 857-859 862 869 899- 
900 912 919 922 924-929 933 936 
962 979 988-989 996 1001 1004- 
1008 1018 1039 1047 1059 1064 
1067 1070 1078 1082 1107 1113 
1116-1117 1131 1134-1137 1140 
1149 1151 1157 1180 1206 1229 
1234 1241 1243 1258 1272-1273 
1279 1288-1290 1294 1307-1308 
1312 1320 1323 1330 1356 1360- 
1361 1368 1373-1375 1379 1391 
1400 1417 1445 1468 1482 1493- 
1494 1501-1503 1506-1507 1512 
1517 1522-1524 1530-1533 1537 
1549 1565 1578 1598 1606 1608 
1623 1625 1627 1639 1643 1648- 
1649 1653 1664 1667 1671 1696 
1734 1741 1743-1744 1760-1761 
1771 



RNA Source: GIBCO; Hyseq Library Name: ABD003; SEQ ID NOS: 
3 12-14 18-19 25 30-31 34-36 43- 
45 50-51 56 58 60 65-66 68-69 80 
82 85 87 92 104 107-108 112-113 
115-116 123-124 131-132 135-137 
139 142 146 148-149 152 154 157 
159 163 165 167 169 172 180 192- 
193 196-197 199 203 208 210 212- 
214 223 233 235-237 247 257 259 
261 268-269 272 276 280-281 284- 
288 291-292 295 297 300-301 304 
307 317 320-321 323 327 329-331 
333-334 345-349 356-357 379-381 
393 401 408 414 419 424 426-428 
430 433-436 438-439 443 445 449 
453-454 459-461 468 471-473 476- 
478 483 491 494 496 500 503 507- 
508 516 519-520 525-527 534 536- 
540 542-543 545 553 555 560 569- 
570 574-576 586-588 593 595 597 
601 606-609 616-620 622-623 625 
628-633 635-636 643 645-649 653 
655-656 660-665 668-670 676 681 
687 701 710 715 717 724-728 735 
743 745-746 750 753 759 765-766 
773 775-778 786 789 796 799-800 
802-803 810-811 815 817 820-821 
832 834-836 840 845-847 851 858- 
861 864 869 874 878 883 897 901- 
902 904-905 908 911-914 916 921- 
922 924-927 929 932-934 936-939 
941-942 945 955-958 963 966-969 
977 979-980 985-986 990 992-993 
997-1001 1005-1007 1012 1017- 
1020 1023-1024 1029-1031 1034 
1036 1039 1050 1059 1063-1066 
1078 1081-1082 1085-1086 1089 
1097 1103 1107 1103 1112 1116- 
1117 1119 1121 1124 1127 1130 
1134 1144-1145 1149 1151 1157- 
1158 1167 1170 1178 1184 1188 
1190 1193-1194 1200 1202 1215- 
1217 1220 1226-1227 1229 1231 
1241 1243 1247 1252 1256 1263 
1267 1269 1279 1281 1284 1286- 
1289 1293-1294 1306-1307 1312 
1316-1320 1326 1333 1338 1341 
1344 1348 1351 1355-1357 1368 
1374 1377 1380 1386 1389-1390 
1394 1400 1409 1414 1422-1423 
1425-1427 1437 1443 1446 1454 
1456 1458-1459 1468 1470-1472 
1478 1482-1483 1487-1488 1493 
1497 1499 1506 1508-1511 1517 
1522-1524 1530-1533 1545-1546 
1548-1550 1552 1557-1559 1563 
1565 1567 1569 1571 1586 1588 
1591 1593 1595 1598-1601 1608 
1611 1620-1621 1624-1626 1628 
1630-1632 1636 1640-1641 1644- 
1645 1647 1649 1653-1655 1657 
1664 1667 1669 1673 1678-1681 
1686 1690 1694-1696 1701 1709 
1711 1719 1722-1723 1726-1727 
1731-1733 1738 1740 1743-1744 
1747 1749 1753 1757-1758 1760- 
1761 1765 1771 1785 

Tissue Origin: adult brain; RNA Source: Clontech; Hyseq Library Name: ABR001; SEQ ID NOS: 
9 29 68-69 113 115 146 152 206 
223 245 277 307 320 324 330-331 
344 348 352 362 379 384 393 404 
408 414 441-442 454 469 481 490 
506 517 586 597 631 641 659 691 
715 799 803 833 865 871 875 880 
882 908 920 937 1000 1005-1006 
1027 1036 1041 1043 1075 1107 
1112 1121 1127 1136-1137 1144- 
1147 1231 1238-1239 1280 1293 
1320 1345 1355 1361 1383-1384 
1400 1417 1448 1456 1476 1507 
1570 1572 1609-1610 1614 1620 
1626 1645 1653 1754 1759 1770 
1786 



RNA Source: Clontech; Hyseq Library Name: ABR006; SEQ ID NOS: 
5-8 15-16 168 212-213 271 278 
280-281 291-292 300-301 310 314 
321 326 336-338 341 352 357 359- 
360 362 369 374 379 384 393 396- 
397 414 419-420 426-428 430 441- 
442 453 506 616-617 661 689 785 
798 845 1018 1109 1113 1124 1148 
1167 1187 1207 1227 1262 1265 
1285 1312 1317-1319 1324-1327 
1344 1369 1381 1400 1416 1421 
1427 1430-1431 1436 1471 1501 
1557-1559 1586 1588 1651 1653 
1664-1655 1671 1673 1690 1697- 
1698 1700 1711 1717 1719-1720 
1728 1736 1740 1743-1744 1757 
1760-1761 



RNA Source: Clontech; Hyseq Library Name: ABR008; SEQ ID NOS: 
-10 13-19 22-23 25 29 33 37-39 
43-45 50-51 54-55 57-58 60-66 
68-70 72 75 77-80 83 85 89-92 94 
99-105 108-110 112-113 116-117 
123 128 133 135-137 139 143 145- 
146 148 152 154-155 157 166 168- 
172 174-175 181-184 188-190 193- 
194 196 198-200 202 204-205 207- 
208 210 214-215 218 221-22$ 229' 
231-232 234-241 245-247 251-253 
255 257-259 268-269 271 276-281 
285-286 288 290-292 300-302 304 
307 309-311 313 315 317-318 320- 
322 325-326 328 330-331 333-338 
341 344-347 349 352 354 356-357 
362 369-373 376 379-380 382 384 
387 390-391 393-394 397 399-403 
405-411 414-415 417-420 426-428 
437-438 440-444 453-455 462 464 
467 469-471 476 478 482-484 488- 
491 497 503 506-513 516-517 520 
524-526 528-530 532-534 537-540 
542 544 547-551 553 561 565-567 
572-574 577 581 585 587-588 590- 
591 597 599 601-602 606-610 612 
615-617 619-620 622-623 628-629 
631 633-634 636-641 643 645-647 
651-653 655-664 669-671 673 679 
682 687 689 691-700 702 706 710 
715-717 720-721 725-734 736-739 
742-743 746 750-752 755 758-759 
762-764 766 768 773-778 780-782 
784-785 787-789 794 796 799 802- 
803 805 811 814-815 818 825-826 
834-837 839-840 842-843 856-859 
861-862 865 867-872 874-875 881 
883-884 887 889-892 894-895 897- 
898 901 904 908 910 912 914 917 
919 921-924 926-927 930-932 935- 
941 943 945 949 953-954 958 961- 
963 967 969 971 975 977 981-983 
986 988-990 992 997 999-1002 
1004-1006 1008 1012 1018-1023 
1027 1029-1031 1035-1037 1047- 
1048 1053 1057 1059 1063 1068 
1070 1072-1075 1077 1081-1083 
1085-1093 1095-1096 1108-1112 
1114-1125 1127 1131-1133 1135- 
1138 1142-1145 1148-1158 1160- 
1163 1167 1169 1172 1175 1177 
1180 1183-1188 1191-1195 1199- 
1200 1204 1206 1211 1213-1216 
1222-1223 1226-1227 1229-1231 
1234-1235 1241-1242 1244-1263 
1266 1269-1271 1276-1277 1279- 
1281 1284-1286 1292 1294-1295 
1299 1305-1309 1312 1314 1316- 
1319 1322 1324-1327 1330 1332 
1334-1335 1339 1344-1346 1351 
1354-1355 1357-1358 1365-1367 
1369-1370 1373-1374 1376-1379 
1381-1384 1386-1388 1392 1394 
1396-1397 1400 1403-1407 1410 
1414 1419-1420 1423 1432-1433 
1435 1437-1438 1440-1442 1446 
1448 1453-1455 1457 1461 1463- 
1464 1466 1468 1471 1477 1480 
1482-1483 1496 1502-1504 1507- 
1509 1513 1519-1520 1524-1526
1536 1547 1549-1552 1567 1573- 
1574 1578 1586-1589 1597-1598 
1601-1602 1605 1607-1609 1611- 
1617 1619-1621 1623 1625-1626 
1635-1641 1643-1645 1649 1651 
1653 1656-1658 1664 1669 1671- 
1674 1676-1684 1686 1689-1690 
1694-1696 1704-1705 1708-1709 



adult brain 



Clontech 



ABR011 



1720-1724 1726-1728 1730-1733
1737-1740 1742-1745 1753 1756- 
1757 1759-1761 1765 1767 1771- 
1772 1776-1777 1779-1780 1786



adult brain 



24 7^ 103 186 210 310-311 364- 
365 508 623 710 937 1002-1003 
1059 1204 1609 1731-1732 



BioChain 



ABR012 



adult brain 



adult brain 



adult brain 



adult brain 
adult brain 



46 182-184 204-205 300 739 767 
1371 1549 1620 1684 



Invitrogen 



ABR013 



Invitrogen 



185 204-205 364-365 393 497 595
687 692-694 830 845 1068 1320 
1413 1640 



ABR014 



Invitrogen 



ABR015 



187 301 357 364-365 375 454 463 
731 859 939 983 1073 1262 1270 
1320 1403 1640 1651 1657 1696 
1722 1738 



419 434-435 441-442 763 789 983 
1320 



Invitrogen 



ABR016 



312 364-365 379 1320 1334-1335 
1674 1722 1785 



Invitrogen



ABT004 



743 
803 
874 



cultured 
preadipocytes 



14-16 22-23 25 37-3$ 43 58 60

70-72 78 86 94 107 113 116 136- 
137 143 146 152 161 173 182-184 
194 196 198 210 218 229 259 267 
295 298 309-310 320-321 324 336- 
338 346-347 349-350 356-357 362 
371 379-380 382-383 391 393 396 
399 401 408 428 438 459 461 476 
482 490 502 507-505 516 526 531 
557 562 597 602 607-609 624 652 
655 667 669 671-672 687-689 695- 
696 710 712 715 721 732 739 
750 753 766 778 780-781 789
814 826 830 837 841 857 869
894-895 925 937 949 954-956 960- 
961 963 968-969 988-989 1000 
1005-1006 1016-1019 1021 1036- 
1037 1052 1086 1090 1109 1113 
1115 1120-1121 1123-1124 1136- 
1137 1140 1144-1147 1151 1167 
1170 1174 1188 1193-1194 1205 
1225 1229 1231 1254 1258 1262 
1280 1285 1309 1312 1334-1335 
1341 1343-1344 1356-1357 1370 
1378-1379 1383-1384 1403-1404 
1423 1429 1434 1442 1448 1451- 
1452 1454 1470-1472 1482 1499 
1525 1528-1529 1532 1536 1547 
1554 1557-1559 1561-1562 1567
1585 1588 1590 1595 1601-1604 
1608 1610-1613 1615 1619 1624 
1627 1640 1644 1647 1660 1664 
1666 1670 1675 1696 1704 1715 
1723 1727 1738 1760-1761 1768 
1779 1785-1786 



Stratagene



ADP001 



5-8 11 17 25 68-69 80 82 87 103
105 110 116 136-138 168 171 188-
189 196-198 261 267 276 288 293
301 318 331 336-338 379-380 391
400 428 430-431 510 512 520 524
527 549 557 561 602 618 620 622
631 637 647 670 681 682 710 731
748 782 793-794 817 834-836 843
845 858-859 879 882 893-895 934
960 982 986 995-996 1000 1002
1005-1007 1025 1027-1028 1032
1039 1045 1071 1078 1097 1099-
1102 1136-1137 1140 1219-1220
adrenal gland



1260 1271 1297 129F 1314
1322 1329 1339 1345 1365-1366
1370-1371 1398 1408 1423 1431
1437 1466 1468 1533 1539 1594
1602 1608 1614 1631 1649-1650
1660 1662 1673 1687-1688 1696
1711 1719-1720 1742 1746 1749
1760-1761 1765 1767 1771 1785



Clontech



ADR002 



adult heart 



4-10 15-14 23 29-31 43-4$ 47 50- 
51 55 60 62-63 65-66 75 80 102 
116 118 122 126 130 137 150 169- 
170 181 192 198 201-203 215 227- 
228 247 251 255 267-269 271 280-
281 285 295 298 311 336-338 342
349 351-352 354 372-373 383-385 
391 400 410 415-416 424 426-427 
431 434-437 439 445 454 461 473 
477 483 491 493 497-498 503 516 
519 527 535 546 549 552 572-573 
581 588 595 600 602 608-610 620 
628-630 637 645-646 670 679 703 
713 715 719 732 734 744-746 758 
773-778 789 816 829 837 845 848 
869 875 883 898 904 912 922-923
930-931 942 948 952 965 967 969 
976-977 981 990 992-993 1001 
1004 1049 1055 1059 1071-1072 
1076 1112-1113 1115 1121 1127 
1134-1135 1151 1158 1163 1175
1181 1188 1209 1218 1224-1225 
1227 1231 1243 1270-1271 1274 
1280 1285 1290 1293 1307 1324- 
1325 1327 1330 1342-1343 1345 
1348 1365-1366 1369 1378-1379 
1387 1398 1400 1405 1417 1425-
1426 1436 1440-1441 1444 1454
1463-1464 1488 1491 1507 1512 
1538 1546 1567 1573-1575 1588
1598 1609 1614 1618 1622 1624 
1627 1634 1636 1649 1651 1658 
1671 1674 1678-1679 1691-1692 
1703 1717 1727 1731-1732 1737 
1765 



GIBCO 



AHR001 



4-8 10-11 15-16 18-21 34-39 44-
46 50-52 57-58 60 62-63 71 75 82
85 87 89 94 97 100 103-104 108-
110 112 114 116 118-119 122-123
127 130-132 134 136-138 141-144
147-151 153 163-164 168 171 179
186 192 195 197 199 204 205 212-
215 220 225-226 229-230 232 234-
236 251 257-260 262 265 272 274
277 280-282 285 286 289-292 296
298-301 304 307 309 314 321 324-
325 330 333 336 338 345 349 351-
352 354 358 361 368 370 380 383-
384 387-388 391 393 397 401 406
408-409 411-412 414-416 430-431
433-439 445-446 449 452 454-455
457 459 462 469 472-473 476-480
483-484 487-490 492-493 496-498
503 506 508 510-513 516 519-522
526 534 536-540 542 546 549 553
560-562 574-577 581-582 584 586-
587 589 593 595 597 604 609 611-
612 615-620 622-623 626 632 637
645-652 656-660 665-666 670-672
674-675 683-684 687 692 694 697
701 709 712 715-716 719-720 725-



adult kidney 



GIBCO 



AKD002 



726 728 730-732 735 738-739 743-
744 746 751 753 759 761 765 770-
771 775-780 785 788-790 796 802 
804 810 812 817 821 826 828 830 
837 843 845-847 849-853 857-861
863-864 869 871 875 877-879 881 
883 887 890-892 894-895 897-898 
901 903 906-907 911-913 915 919 
921-925 927-928 933-935 945 958 
961-963 967 969-972 975 977-978 
980-986 990 992 999-1002 1005- 
1007 1010 1016 1019-1020 1022-
1023 1025 1028-1037 1039-1040 
1043 1047 1050 1054-1055 1057 
1059 1063-1064 1067-1068 1070 
1072 1075-1076 1083 1085-1087 
1089 1093-1094 1104 1106 1108- 
1109 1113 1116-1117 1119 1121 
1124 1126 1128 1131-1134 1144- 
1145 1148-1149 1151 1158 1167 
1169-1170 1175 1177 1192 1196 
1199-1200 1202 1206-1208 1211 
1216 1218 1222 1227-1229 1232- 
1235 1238-1241 1243-1244 1247- 
1248 1250 1253-1254 1256-1258 
1261 1268 1270-1271 1277 1280- 
1282 1287 1292 1298-1299 1306 
1308 1317-1321 1324-1325 1330 
1332 1334-1337 1339 1344-1345 
1349-1350 1354-1356 1359-1360 
1365-1366 1369 1371 1374-1375 
1378-1380 1383-1384 1389 1397 
1400 1403 1409 1417 1423-1426 
1437 1439 1442 1444 1446-1447 
1450 1453 1468 1470 1473 1479 
1481 1488 1490 1501-1504 1519 
1521 1524 1528 1530-1534 1536-
1537 1539 1541-1542 1547 1553 
1555 1560 1565 1567-1571 1588 
1591 1597-1598 1601-1602 1605 
1614-1616 1619-1620 1623-1628 
1630-1632 1634 1636 1641 1644- 
1645 1647 1649 1652-1655 1659 
1662 1667 1673-1674 1680-1681 
1684 1686-1688 1704-1705 1709 
1711-1712 1717 1724 1726-1727 
1731-1733 1737-1738 1741 1743- 
1744 1749 1754-1755 1760-1761 
1765 1772 1785 



4-8 10-11 17-21 29-31 35-39 42-
45 50-51 56-58 60-61 64 68-69 75 
77 80 82 85 87 92-94 97 100 102-
104 107-108 112 116-117 119 123 
127-133 136-137 139-141 143-144 
147-154 157 161-163 165-166 169
172 176 178-179 192 194-197 199 
201 203-206 209-210 212-213 215- 
216 223-228 234-236 238 247 251- 
253 257-259 261-262 265-269 271- 
272 274 276-277 279-281 284-286
290 293 296 298-299 301-302 304 
307 311-313 321 325-326 329-331 
333 341 344 348-350 352 356 358- 
359 362 364-365 368 370-372 374 
376-377 380-382 392 395 398 400-
401 404 407-409 414-415 423-424 
430-437 443-444 446 449 451 453- 
455 459 461-462 464 467 469 471- 
474 476-477 480-481 483 487-488 
adult kidney



Invitrogen



490-491 493 497-505 510-513 516-
520 522 524 526-529 534 537-540
544 547 549 554-556 560 562 564
567 571-576 578 582 586-589 592-
593 598-599 601 604-606 608-613
615-619 621-626 632 634 637-643
645-652 655 660-664 669-672 676
678-679 688 692-695 698 702 711
713 717 719-720 727 731 735-736
738 743 745-746 751 753 755 762-
763 765 771-773 775-778 780 786
788 793 795-796 800 803 805 808
810-812 814-819 821 826 829 832
834-838 842-845 848-855 857-861
864-865 867 869 871 874 876-883
886-887 889-891 893 896 898-900
902 906-908 910-914 918 920 922
925-927 929-935 937 940-942 945
948-949 951 953-958 960-961 963-
964 969-970 972 976-978 982-986
988-990 992-993 995 997 999-1002
1004-1008 1010 1012 1013 1016-
1017 1019-1020 1022 1025-1031
1035 1038-1040 1042 1044 1047
1050 1054-1055 1057-1064 1068
1070-1073 1078 1085-1086 1088-
1089 1092 1094 1097 1099-1102
1107 1109-1112 1116-1119 1121
1123-1125 1132-1135 1140 1142-
1143 1146-1147 1149-1150 1153-
1154 1157 1159 1163 1167 1170
1178-1179 1181 1183 1192 1196-
1200 1202-1204 1206-1211 1216-
1219 1221-1222 1225 1227-1230
1232-1234 1238-1241 1243-1244
1246-1247 1253 1257 1258 1260-
1261 1267-1268 1270 1272-1274
1281 1283 1287-1289 1293-1295
1299 1306 1308 1311 1313 1317-
1320 1323 1329-1330 1334-1335
1339 1341 1349-1350 1353-1357
1359 1367 1369 1373 1375 1378-
1379 1394 1397 1400 1403 1405
1407-1409 1417 1419 1423-1424
1428-1431 1433 1437-1438 1442-
1443 1445-1446 1448-1450 1453-
1454 1459 1461 1465-1468 1474-
1475 1478 1484-1488 1490 1492-
1493 1495 1497-1498 1506-1507
1509 1512 1518 1521-1522 1525
1527-1528 1532-1533 1537 1540-
1541 1547-1550 1552 1556-1559
1561 1565-1566 1568 1571 1575
1578-1579 1583 1586-1587 1585
1591-1592 1594 1598 1600 1603-
1604 1606 1608 1611 1613 1615-
1616 1618-1622 1624 1628 1631-
1632 1634-1636 1638-1639 1641
1644 1646-1649 1653 1656 1662
1664 1666-1667 1670 1671 1676-
1679 1683-1684 1686 1691-1692
1696-1699 1701 1709-1711 1713-
1714 1716-1719 1723-1724 1726-
1727 1733 1737-1738 1741 1743-
1744 1748-1749 1751 1760-1761
1763-1768 1778 1780 1785



AKT002



20-21 37-39 47 52 57 60 65-66 
68-69 80 104 107-108 122 130 133 
136-137 140 142-143 145 165 174 


















181 197 227-Sid 235-236 244 251
261-265 267 280-281 286 290 299
301 304-305 309 312-313 339 341
344-345 349 358 370-372 376 382-
383 387 392 401 414 416 421 430
443 445 449 453-454 472 487-488
504 506 513 516 519 522 528 536-
540 546 554 585 587 594 598 602
607 616-617 626-627 636 643 662-
664 695 709 721 735 743 761 768
775-777 788 796 804 814 827 837-
838 849-850 852-853 869-870 881
890-892 898 903 905-907 914 919
925 927 934 941 949 952 957 960
962 968 970 1000 1008 1029-1030
1044 1052 1055 1063 1067-1068
1073 1085 1099-1102 1107 1110-
1111 1113 1115 1119 1126 1134
1136-1137 1146-1148 1153 1159
1192 1196 1199 1232-1233 1241
1256 1264 1272-1273 1281 1285
1293-1294 1299 1312 1320 1324-
1325 1330 1344 1349 1351 1355-
1356 1369 1378-1379 1403 1414
1419 1428-1429 1436 1446 1458
1463-1464 1467-1468 1470 1477-
1478 1486 1491 1509 1519 1527
1529 1534 1547 1596 1600 1619
1623 1629 1631 1634 1638 1643
1647 1652 1660 1664 1667 1669-
1670 1673 1686 1709 1727 1740
1776








adult lung 


GIBCO 


ALG001


4-8 14 37-39 44-46 50-51 56 62-
63 75 82 88 93 103-104 113 125
133 140 143 150 152 154 157 162
171-172 174-175 190-191 196 200
211 214 219 223-224 227-228 251-
252 256 265 272 274 280-281 285
310 332 345 351 362 371 381-382
394 408-409 431 436 445 454 459
461 467 469 471 476-477 488 504
513 527 537-540 544 547-548 554
564 583 607 616-617 621 623-624
634 645-646 662-664 670 695 716
719 743-744 763 766 774 789 803
811 814 817 831-832 837-838 845
852-853 858-859 861 866 880 887
901 905 941 954-957 966 971 977
979 981 987 990 992 996 1001
1005-1006 1014 1017 1045 1047
1054 1059 1062 1064 1072 1080
1086-1089 1094 1107 1126 1134
1136-1137 1142 1150 1157 1173
1190 1200 1208 1220 1241 1272-
1273 1280 1282 1295 1306 1320
1331-1332 1353 1374 1379 1383-
1384 1404 1409 1423 1434 1436
1442 1474 1478 1494 1509 1522
1525 1531-1532 1547 1549 1553-
1554 1571 1598 1606 1613 1624
1627-1629 1632 1642 1644 1662
1669 1676-1677 1684 1696 1727
1731 1732 1737-1738 1748-1749
1786


lymph node 


Clontech 


ALN001 


4 24 50-51 82 105 137 153 198
201 223-224 234 268-269 272 280-
281 287 301 312 329 343 382 421
430 433 445 451 461-462 475 481-
482 503 526 529 537-540 546-547



621 626 649 679 719 725-726 738
793 803 831 834-836 838 844 857-
858 866 875 905 913 928 963 976
1005-1006 1012 1038 1050 1116-
1117 1151 1199 1204 1226 1243
1265 1274 1324-1325 1339 1353
1374 1377 1440-1441 1447 1504
1549 1600 1618-1619 1631 1641
1644 1653 1687-1688 1691-1692
1741 1771



young liver 



GIBCO 



ALV001 



adult liver



Invitrogen



ALV002 



5-8 11 20-21 46 50-51 58 65-66 
75 79 82 93 97 102-103 108 110 
116 139 143-144 148-149 171-172 
174 187-189 194-195 198 209 214- 
215 230 250 258 267-269 280-281 
306 309 342 351 356 359 362 372 
374 392 394 398 401 407-408 410 
414 431 444 455 459 476 478 483
493 510-512 516 520 522 526 536 
549 571 574-577 585 592 601-602
607 621-624 628-630 632-633 637 
648 660 666-667 678 697-698 700
717 719 728 730 734 738 744-745 
773 779 788 800 808 812 
814 841 849-851 871 874 879 887 
893 898-900 902-904 906-907 911 
919 922 924 934 953 957 963 965 
970 984 986 997 1001 1004 1007 
1012 1029-1030 1033-1034 1052 
1061 1066 1070 1076 1086 1089 
1093 1099-1102 1110-1112 1116- 
1117 1119 1121 1125 1136-1137 
1144-1145 1156-1157 1159 1196 
1199-1200 1209 1211 1219-1220 
1241 1244 1262 1270 1275 1279 
1283 1295 1317-1320 1332 1339 
1344 1359 1362-1363 1379 1383- 
1384 1403 1415 1430-1431 1437 
1450 1467 1475-1476 1483-1484 
1494-1495 1498 1505 1512 1516 
1518-1519 1526 1529 1547 1550- 
1552 1557-1559 1565 1583 1587
1597 1609 1614 1620 1631 1637 
1641 1644 1654-1655 1662 1667 
1669 1684 1691-1692 1702 1711 
1725 1738 1741 1743-1744 1758 
1760-1761 1763-1765 1769

8 17 20-21 32-33 41 55 58 64 



75 77 86 89 102 108 117 119 175 
176 198 200 209 231 235-236 250 
272 275-276 284 306 316 321 325 
333 356 359 374 376 398 401 408
414 428 430 433-435 454 476 494 
503-505 517-518 528 534 544 552 
561-563 567 578 581 608-609 630 
632 637 644 650 661 665 672 702 
707 710 721-722 750 753 778 782 
794 814 820 826 834-837 847 849- 
850 858 861 874 879 893 898 904 
911 918 921-922 926 946 948 972 
978 986 996 1020 1027 1031 1034 
1053 1063 1068 1070 1073 1086 
1089 1093 1097 1113 1119 1156 
1159 1195 1198-1199 1208 1220
1227 1241 1261 1272-1273 1277 
1285 1308 1315 1320 1324-1325
1330 1362-1363 1375 1403 1408- 
1409 1415 1431-1432 1435 1467 
1469 1482 1504 1524 1542 1547 



adult liver 
adult ovary 



T55T 156T 1578 1581 1583 1594
1597 1601-1602 1611-1612 1615
1618-1619 1621 1625 1637 1645
1647 1652 1654-1655 1660 1666
1669-1671 1684 1706 1722 1737-
1738 1742-1744 1760-1761 1763-
1765 1772 1774



Clontech 
Invitrogen 



ALV003 
AOV001 



29 676 997 1063 1119 1536 1766 



1 4-18 20-23 29 35-40 42-48 50- 
51 53-58 61-63 65-66 68-69 73-75 
77-78 80 82 85 87 89 97 100-101 
103-104 106-108 110 113 115 118 
122-124 126 128 133-134 136-140 
142 145-147 149-157 161 166 168- 
170 174 177-178 180 182-186 188-
189 192-203 207 209 211-215 219 
221-224 229-230 234 242-243 246- 
247 255 258 260-262 265-269 271- 
272 274 277-281 284-286 288 290 
295 299 301-302 304 307 309-311 
313-314 316 321 323-326 330 332- 
333 335-338 341 344 349 352-353 
356 358 360 362 370-372 376-377 
379-384 387 390-392 394 397-398 
400 403 408-410 412 414-416 423- 
424 426-427 430-435 439 443-446 
448-449 451 453-455 462-463 468- 
471 473 476-479 481-484 487 489- 
494 496-497 499-501 503-505 509- 
514 516-517 519-520 522 524 526 
528-534 541-544 546-547 549 552 
554-555 561-564 566-567 569-570
572-573 575-576 579 581 583 585-
588 590-591 593 595 597 599 601- 
605 607-613 615 618-622 624-627 
630 632-633 636-640 642 644-647 
649-652 654-655 657-665 667-675 
677-678 681 683-684 692-695 697- 
710 714-721 723 725-727 729 732 
734-735 743-746 750-751 753 758 
763 765 767 772-773 775-778 780 
783-784 786 788 790-791 794-796
800 803 805 809-811 813-815 818- 
819 821-824 826 828-829 831-832
837-838 843-850 852-857 859-864
867 869 871-872 874-875 878-883 
887-888 890-895 898-910 912-914 
916 919-922 924 926-927 929-939 
941 943-946 948-951 953 955-950 
961-964 966-967 970-979 981-982 
985-986 988-990 992 995-997 999- 
1001 1004-1009 1011-1013 1016 
1019-1020 1024-1025 1029-1031 
1033-1035 1037 1039 1041-1047 
1050-1051 1054-1060 1062-1064 
1067-1070 1072-1073 1075-1076 
1078-1079 1085-1086 1089-1090 
1094-1096 1098-1103 HCff-1108 
1112-1117 1119-1120 1123-1127 
1131-1135 1142-1143 1146-1149 
1153 1156 1158 1163 1165-1166 
1169-1171 1173-1175 1177-1178 
1180 1183-1185 1190-1191 1195
1197-1200 1202 1205-1214 1217- 
1219 1221-1226 1232-1235 1238- 
1241 1243-1244 1247 1249 1252- 
1254 1256-1258 1262 1265 1267- 
1268 1270 1275 1278 1280-1283 
1286-1289 1291 1293-1294 1298- 



1299 1306 1308 1312 1317-1321
1323 1327 1329-1330 1332-1333
1338-1339 1341 1343-1351 1356
1359 1361 1365-1366 1371-1375
1377-1379 1383-1384 1386 1389
1394 1400 1404 1416-1417 1422-
1427 1429-1431 1435-1436 1439-
1443 1445-1450 1453-1454 1459
1463-1464 1466 1468 1470 1474-
1481 1484-1485 1488 1491 1493-
1494 1496-1498 1501-1504 1506-
1507 1511-1517 1519 1521-1524
1526-1527 1530-1531 1534-1536
1538-1539 1541 1546 1548-1550
1553 1555-1559 1561-1563 1566-
1567 1569-1570 1572 1574-1575
1578 1580-1581 1587-1588 1590-
1591 1595 1597-1598 1600-1606
1609 1611-1621 1623-1630 1634
1636 1638 1641 1643 1645 1647-
1657 1659-1662 1664 1667 1669-
1671 1673-1674 1676-1681 1683-
1690 1699 1702-1707 1710-1711
1713-1714 1716-1719 1723-1724
1726-1728 1731-1733 1735 1737-
1738 1740-1741 1743-1744 1748-
1751 1753 1755-1756 1760-1762
1765 1767-1768 1770-1771 1776
1778-1779 1783-1784 1786



adult placenta



Clontech 



APL001 



5-8 44-45 90-91 107-108 159 178 
311 351 414 476 503 545 574 624 
636 719 755 773 860 890-891 924 
947 955-956 962 990 992 1002 
1045 1202 1320 1369 1628 1686 
1713-1714 1743-1744 



Invitrogen 



APL002 



14-16 2* "29" 43 d0-6l 79-80 103 
106 116 135 171 177 180 194 196 
198 210 216 235-236 272 290 299 
309 329 334 339 359 379-380 417 
423 430 434-435 448 454 483 490- 
491 517 522 631 723 725-726 728 
738 746 769 818 843 854-855 857- 
858 916 948 953-954 976 988-989 
1005-1006 1013 1033 1036 1064 
1068 1070 1086 1139 1144-1145 
1160 1277 1285 1317-1320 1343 
1345 1429 1435 1438 1454 1482 
1486 1490 1512 1519 1532 1549 
1592-1593 1602 1626 1647 1649 
1664 1673 1675 1722 1727 1730 
1746 1776 



placenta 



29 34-36 
94 98-99 
139 141 
171 174 
209-211 
258 264 
309 312 
382 386- 

•436 446 
500 503 

•540 547 
595 604 

-632 642 

•675 684 
742-744 
789 794 
845 848 
879 882



adult spleen



GIBCO 



ASP001 



3 5-8 12 15- 
44-45 57 60 
103 106 108 
147 152-153 
178-180 196 
215 219 234 
272 280-281 
325 333 341 
387 394 406 
448 451 473 
505 517 519 
554 557 574- 
611-612 620- 
652 659 661 
700 721 728 
746 762 765 
810-811 
852-853 



817 
858 



16 19-21 24 
62-83 87 89 
117 119-121 
155 166 169 
198 201-206 
253-254 256 
290 295 302 
349 358 372 
414 431 434 
481 490-493 
530 534 536 
576 582 592 
621 623 631 
667 671 673 
730 732 738 
774 780 788 
822 830 832 
862 866 874 



884 906-908 912 919 
927 934 942 949 957 
978 983 990 992-994 
1005-1007 1010 1012 
1042-1044 1046 1049 
1070 1076 1089-1090 
1109 1113 1115 1124 
1170 1174 1177 1190 
1220 1226-1227 1229 
1258 1269 1271 



921-923 926- 
958 963 977- 
996-997 999 
1031 1036 
1059 1068 
1094 1103 
1140 1163 
1196 1219- 
1236 1241 
1274 1295 
1334-1335 
1359-1360 
1397 1413 
1439 1468 

•1487 1498 

•1549 1553 
1631 1636 
1662 1670 
1686 1700 

-1741 1760- 

-1782 



1246 
1301 



1320 1322 1330 



1339 1349 1351 1353 
1364 1369 1374 1386 
1417 1434 1436-1437 
1474 1477 1480 1485 
1512 1522 1525 1544 
1560 1567 1591 1600 
1651 1654-1655 1658 
1674 1678-1679 1684 
1727 1733 1738 1740 
1761 1774 1779 1781 



testis 



GIBCO 



ATS001 



Genomic DNA
from BAC 63118 



Research 
Genetics 
(CITB BAC 
Library) 



BAC001



Genomic DNA 
from BAC 39316 



Research 
Genetics 
(CITB BAC 
Library) 



BAC002 



5-8 10 26 30-31 47 50-51 57 68- 
69 82 84-85 97 102 113 119 137 
139 150 152 154 156 163 169 174 
176-177 192 194 196-197 212-215 
227-228 247 255 258 261 282 285 
288-289 301 307 311 316 330 334 
349 370-372 392 398 410 415 426- 
427 430-431 433 437 446 454 461 
469 473 477 481-482 493 499 502- 
503 513 522 526 547 552-553 563- 
564 572-573 575-576 581-582 585 
599-602 605 612 615-617 620 631 
637 647 649-650 656 660 665 670
674-675 712 719-721 723 728 731 
738 744 746 773 780 784 788-789
802 804 809 811 814 826 831 837
843 845 848 859 866 869 877 905 
913 916 919 921 926 929 937 950 
960 963 971 975 977 981 990 992- 
993 1007 1016 1029-1030 1034- 
1035 1038-1039 1045 1059-1060 
1064 1070 1072-1073 1087 1089 
1097 1099-1102 1104 1108 1113 
1141 1149 1161-1162 1175 1208- 
1209 1222 1227 1229 1231 1235 
1238-1239 1243 1253 1285 1287- 
1289 1291-1293 1307 1311 1317- 
1320 1330 1332 1338 1345 1369 
1373-1374 1379 1389 1399 1 1400 
1409 1423-1424 1430 1435-1437 
1443 1459 1484 1486 1490 1493 
1496-1497 1501 1505 1509-1513 
1527 1530-1531 1533 1537 1546 
1549 1563 1565 1567 1569 1571 
1577 1586 1591 1599 1602 1625 
1628 1630-1632 1636 1639 1642 
1649 1661-1662 1666-1667 1670 
1675 1684 1690 1699 1705 1712
1717 1724 1730 1737-1738 1752 

1767 1779 

'684 13*2 1412" 



1411-1412 
Genomic DNA
from BAC 39316



Research
Genetics
(CITB BAC
Library)
Invitrogen



BAC003 



TJ5T 



adult bladder 



BLD001 



5-8 17-18 22-23 33 37-39 56-57
80 93 100 120-121 169 201 237
251-252 272 278 311 348 363 382
413 415 424 430 443 483 502 542-
543 562 564 607 616 617 626 635
652 667 671 710 727 755-756 762
773 786 788 837 840 866 893 896
909 918 929 966 977 983 1016
1025 1055 1073 1082 1140 1167
1185 1189 1199 1270 1369 1481
1536 1560 1573 1596 1614 1636-
1637 1649-1650 1654-1655 1658
1669 1671 1690 1719 1727 1731-
1732 1739 1741 1760 1761 1779



bone marrow 



Clontech



BMD001 



3-8 11 13 18 29-31 33 35-36 40
43-45 47-48 50-51 57 60 65-66 75
80 82 85 88-89 94 100 103 107 
110 115 118-119 124-125 133-134 
136-137 139-141 146 150 152-153 
155 161 163 168-170 172 178-180 
187 192-193 197-198 203-205 210- 
213 215 217 219 222 224-226 233 
235-237 242-244 255 258 260 263- 
264 266 273 276 278 283 286 290 
295 301-302 307 312-313 321 330 
333 339 343 352 357-358 370-371 
382 384-385 387 389 394 408 410 
412 416 421 424-427 429-431 436- 
437 439 441-442 445 447 454-456 
461-462 471-472 475 477-479 481- 
482 485 488 493 498 500 503-506 
513 516 519 523-524 526 530 535- 
540 542 544-545 549 555 565 567 
569-577 581 583-586 588 593 601 
603-604 608-609 613-619 621-622 
632-633 636-637 642 649-650 656- 
660 666 670 672 674-675 679 683 
701 708 716 718-720 731 735-736 
740-742 744-745 752 761 765 772- 
773 775-778 780 785-786 789-791 
796 798 802 810-812 823-824 826 
830 832-833 837-838 843-844 848- 
855 858-859 866-867 869 878-880 
883 890-892 896 903 905 908 912- 
914 922-924 927 930-931 937 939- 
941 952-953 955-956 963 969 973 
976 981 985 987 990 992 995 1000 
1002 1005-1007 1013 1016 1025 
1028-1031 1033 1035 1037 1039 
1042 1044 1047 1050 1053-1054 
1059 1061 1063 1066 1070-1071 
1079 1106 1110-1113 1115-1117 
1124 1126 1134-1135 1142 1144- 
1145 1163 1172 1178 1197 1199- 
1200 1202 1216-1217 1224 1227- 
1228 1240 1246 1254 1261 1266 
1270 1278 1281 1285 1287 1290-
1291 1293 1299-1301 1308 1314 
1317-1320 1327 1331 1339 1343 
1346 1349 1353 1356 1361 1367 
1369 1372-1374 1379-1380 1394 
1400 1403 1406 1408 1413 1417 
1419 1423 1425-1427 1430-1431 
1433 1439 1443 1446-1449 1459 
1463-1464 1482 1486 1493-1494



TS&T TsbT TSTT 1521 T52T 1524
1526 1528 1531 1536-1537 1543
1546 1548-1549 1552 1554-1555
1557-1559 1571-1572 1581 1589-
1592 1597-1600 1609 1614 1621
1626-1628 1630-1632 1634 1636
1638-1639 1641 1646-1647 1651
1653-1655 1661-1662 1676-1681
1684 1686 1690 1702 1707 1711
1713-1714 1717 1720 1722-1723
1727 1737-1738 1740 1758 1767
1772 1781-1782 1785 1786



bone marrow 



Clontech 



BMD002 



bone marrow 
bone marrow 



11 15-16 19 30-31 35-36 68-69 75 
83-84 93 99 103 108-109 118 137 
139 169-170 174 177 180 190 193 
212-213 219 222 225-226 232 237 
255 259 264 273-274 284 286 290- 
292 295 301 303-304 307 312-313 
316 324 326 330 334-335 348 352- 
353 357 360 370-373 384 386-387 
397 403-404 414-416 421 425-427 
429-430 433-436 440 444 451 454 
465-466 472 475 478 491 493 516 
520 523 525 531 545 548 552 566
569-570 581 583 590-591 597-598
601 616-617 621 641 650 652 656 
659 671 674-675 679 684 710 718- 
719 728 734 737-738 742 761 765 
774-778 790 811 814 818 830 834- 
836 854-855 859 866 869 871 878-
879 884 889 892 904 922-923 932 
990 992 998 1001 1004 1016 1036 
1042 1048 1051 1054-1055 1058 
1088-1089 1106 1112-1114 1155 
1157 1192 1200 1223 1227-1228 
1236-1237 1260-1261 1282-1283 
1285 1287 1295 1314 1317-1321 
1324-1327 1330 1333 1341 1343 
1347 1350 1353 1355-1357 1367
1369-1370 1373 1377 1379 1381 
1383-1384 1394 1397 1400 1406 
1413 1417 1425-1427 1438 1442 
1446 1459-1460 1470 1493 1505 
1521 1536 1546-1549 1560 1573- 
1574 1578 1598-1600 1621 1626
1631 1634 1646 1649 1653 1656 
1658 1669-1670 1683-1684 1687- 
1688 1690-1693 1696 1699 1702 
1704 1707-1709 1711 1720 1722- 
1723 1725 1727 1729 1731-1733 
1738-1740 1743-1746 1752 1755 
1760-1761 1767 1777 1781-1782 
1786 



Clontech 
Clontech 



BMD004 



73-74 503 922 1036 1711 



BMD007 



95-96 866 1320 1475 



adult colon 



Invitrogen 



CLN001



17 56-58 103 110 117 144 150 171 
179 185 188-189 201 204-206 210 
218-221 225-226 231 237 251 277 
288 310 312 320 333 359 386 388 
394 408 420 455 481 485 503 510- 
512 590-591 615 635 647-648 665 
672 684 697 710 725-726 743 780 
786 788 826-827 848-850 854-855 
858 866 872 898 918 921-923 953 
976 983 993 1005-1006 1017 1020 
1025 1027 1054-1055 1063 1068- 
1069 1140 1153 1170 1185 1196 
1199 1220 1280 1314-1315 1320 
1345 1351 1355 1369 1428 1439 



Mixture of 16 
tissues - 
mRNAs 



Various 
Vendors 



1462-1464 1512 1556 1583 1587
1594 1596 1614 1625-1626 1631
1639 1645 1650 1675-1677 1687- 
1688 1701 1713-1714 1724 1740 
1765 



CTL016 



Toi 1490 1686" 



Mixture of 16
tissues -
mRNAs
adult cervix



Various 
Vendors 



CTL021 



312 782 1132-1133 1403 1712 1715



BioChain 



CVX001 



1 4-8 11 13 18-21 25-26 30-31 33
37-39 43 46-47 58 61 64-66 71 
73-74 82 85 94 100 103-104 113 
118 122 126 130 134 140 147 153- 
156 163 170 179 181 186 192 195- 
196 198 201-202 218-219 222 229- 
231 257 266 276-277 285-286 288 
298 301-302 304 307 312-314 324 
326 329-330 332 335 342 352 358 
362 371-372 376 379 381-382 384 
388 398 400 410 414 416 419-420 
426-427 430-431 433-436 439 446 
448 461-462 464 471-477 479 482- 
483 491 493 496 503 506 510-513 
516-517 526 530 535 542-544 546- 
547 557 561 572-573 575-577 581-
582 585-586 588-589 593-594 600
602 604-605 607-609 612 615-619 
623 644 650 654 657-658 662-665 
670 672 680 683 691-694 698 706 
708-709 711 713 720-721 727 729 
731-732 737 745-747 753-754 760 
765 771 774-777 780 790 793 796 
798 800 803 805 818 826 828 831-
832 834-836 843 847-848 851-855 
857-860 864-866 869 871 876 878- 
880 882 887 890-891 897 899-902
905-908 912-913 916 918-919 922 
927 932 934-938 944 948 955-956 
958 963-964 967 969-970 972 976 
978-979 983 985 990 992 1000 
1005-1007 1016-1017 1024 1027 
1033 1036 1038 1045 1047 1053-
1056 1066-1067 1071 1073 1075 
1079 1082 1098 1113 1124 1129 
1134 1139 1146-1149 1163 1167 
1170 1173 1175 1177 1181 1197
1200 1202 1211 1214 1216 1221- 
1222 1225 1227 1232-1234 1240- 
1241 1243 1258 1264-1265 1268 
1270 1279 1287-1290 1308 1310- 
1311 1316 1320 1323 1327 1345
1349 1353-1354 1360 1372-1374 
1383-1384 1386 1394 1397 1405- 



The 16 tissue-mRNAs and their vendor sources are as follows: 1) normal adult brain
mRNA (Invitrogen), 2) normal adult kidney mRNA (Invitrogen), 3) normal adult liver
mRNA (Invitrogen), 4) normal fetal brain mRNA (Invitrogen), 5) normal fetal kidney
mRNA (Invitrogen), 6) normal fetal liver mRNA (Invitrogen), 7) normal fetal skin mRNA
(Invitrogen), 8) human adrenal gland mRNA (Clontech), 9) human bone marrow mRNA
(Clontech), 10) human leukemia lymphoblastic mRNA (Clontech), 11) human thymus
mRNA (Clontech), 12) human lymph node mRNA (Clontech), 13) human spinal cord
mRNA (Clontech), 14) human thyroid mRNA (Clontech), 15) human esophagus mRNA
(BioChain), 16) human conceptional umbilical cord mRNA (BioChain).
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS:



diaphragm 



BioChain 



endothelial 
cells 



DIA602 
EDT001 



1406 1416 1425-1427 1431 1436-
1437 1442 1446 1448 1453 1459
1466 1472 1478 1482 1496 1501-
1503 1506 1512 1522 1527-1528
1531 1533 1541 1547 1569 1571
1585 1589 1597-1598 1600 1608-
1609 1614-1616 1620 1623-1624
1626-1628 1630 1638 1641 1643
1649 1653 1656 1662 1667 1669
1674-1675 1683 1685-1688 1699
1702 1709-1710 1715 1717 1722
1724 1729 1731-1732 1735-1739
1741 1743-1744 1748-1749 1755
1760-1762 1767 1773 1778 1785-
1786

137 282 289 730 780
1478 1599 1614



986 1409 



Stratagene



3 5-10 13 15-21 24-26 29 34 37- 
39 42 44-45 50-51 53-55 57-58 
60-61 65-66 68-69 73-74 77-78 80 
82-83 85 87 89 93-96 101-105 106 
110 112-114 116 118-122 124 128 
133-134 137-142 147-150 152-153 
161-163 166-172 176-179 187 190 
192 194 196-201 204-207 210 212- 
214 220 224 225-230 233 235-236 
240-241 251-252 258 261-262 265 
267-269 272 276-277 279-281 284- 
285 288 290 295-296 301-302 310- 
311 313 316 321 325 329 331-333 
335 340-342 351-355 360 371 375 
380-382 384 387 390 392 397 400
407-408 410 412 414 416 425-427 
431 434-436 439 444-445 449 454 
463-464 472-475 477-479 486 488- 
490 497-498 500-504 510-513 516- 
519 522 524 526-528 532-534 536- 
540 542-546 548 561-563 566-567 
572-576 579 581 585-586 589 593 
595 597 599 603 607-612 615-617 
620 622 626 630 632-634 638-641 
644 647 656-660 662-664 670 673 
678 680-682 692-697 707 709-710 
712-713 719 730 732 734 736 738 
743-746 751 759 768 771 773 775- 
778 783 786-789 793 800 803 805-
807 810-811 814 816-818 821-822 
824 826 828-829 832 834-838 842- 
845 848-850 854-860 862 864 869 
871 874 876-879 883 885 887 890- 
891 894-895 898-900 903 908 910-
913 916 919-922 924 926-928 930- 
935 939 943 948-949 951-954 957 
959-961 964 969-970 973 975-978 
983-984 988-990 992-993 996-997 
1000 1002 1004-1013 1016-1020 
1022-1025 1028 1031 1033-1034 
1038-1046 1050 1055-1056 1059- 
1060 1062-1064 1067-1070 1072- 
1074 1076 1078 1082 1086-1087 
1089-1090 1093-1097 1099-1103 
1107 1109-1113 1116-1117 1124-
1126 1128-1131 1134-1135 1138 
1140 1144-1145 1148-1149 1153 
1157 1160 1163 1171 1183-1184 
1198-1199 1202 1205-1207 1211 
1216-1217 1219 1221 1225 1229 
1232-1235 1238-1241 1243-1244 
1246 1250 1253 1257-1258 1261
1265-1266 1268 1270-1271 1274- 
1277 1280-1283 1285-1286 1288- 
1290 1293 1295 1298 1308 1312 
1317-1320 1324-1325 1327 1329-
1330 1334-1335 1338 1342-1343
1345-1347 1350 1355-1356 1359
1367 1369 1374 1376 1379 1398 
1400 1406 1408 1414 1417 1419 
1424-1426 1428-1431 1434-1438 
1440-1442 1448 1450 1462-1466 
1468 1472 1474 1478 1487-1488 
1491-1493 1501-1504 1506 1509 
1511 1516 1520-1521 1526 1529 
1531 1536-1537 1539-1540 1546- 
1547 1549 1552 1555 1557-1559 
1561-1565 1568 1571 1575 1578- 
1579 1581-1583 1587-1588 1590 
1592 1597 1605-1606 1611 1613 
1615 1618-1621 1624-1628 1630- 
1631 1634 1636 1638 1641 1643- 
1650 1652-1659 1664 1666-1667 
1669 1671 1675-1681 1683-1688
1696-1698 1703 1711 1715-1716 
1719 1722-1723 1726 1731-1733 
1736 1739-1741 1743-1744 1749 
1755 1760-1761 1765 1767-1768 
1771-1773 1776 1779 1783-1786 
286 686 1297 1303-1304 1352 
1411-1412 1754 



Genomic clones 
from the short 
arm of 
chromosome 8 



Genomic DNA 
from 
Genetic 
Research 



EPM001 



131-132 261 289 380 503 860 892 
1000 1007 1397 



esophagus 



BioChain 



ESO002 



62-63 89 112 126 154 322 336-338
379 391 411 481 546 563 607 679 
710 867 1012 1031 1055 1251 1262 
1320 1407 1643 1652 1686 1731- 
1732 1746 1765 



fetal brain



Clontech



FBR001 



68-69 90-91 139 212-213 301 331 
362 374 403 436 611 645-646 659 
668 670 691 785 805 845 1163
1209 1216 1232-1233 1238-1239 
1387 1410 1416 1430 1496 1536 
1547 1593 



fetal brain 



Clontech 



FBR004



fetal brain 



Clontech 



FBR006



5-9 25 43 60 62-63 65-66 70 72 
80 87 92 101 103 108 114 136 139 
149 152-153 157 168 171-172 175 
207-208 210 212-213 221-226 237- 
238 251-253 266 272 279-281 295 
301-302 307 310 317-318 321-324 
330 333-334 336-338 346-347 352 
357 370 373 377 379-380 382 384 
391-392 397 399 402 406-408 410- 
411 417 421 424 426-427 430 436- 
437 440-443 454 460 464 467 473 
476 483 488-489 495 497 508 510- 
513 516 519-520 524 530 537-540 
544 547 550 561 567 572-574 582 
590-591 595 597 604 607-609 615 
623 628-629 631 634 638-640 655 
657-658 660 665 669 674-675 679 
689 691-694 696-697 699 701 706 
710 716 720 728 732 734 736 742- 
744 757-760 763 775-778 780 799 
806-807 810 817-818 826 839 843
858 861 864 871-872 884 890-891
894-895 890 904 915 921-923 935- 
936 938 945 950 952 955-956 958- 
959 961 963 967 969-971 990 992 
999 1001 1005-1006 1008 1013 
1015 1022 1024 1029-1030 1032 
1035 1042 1047-1048 1052 1056 
1065 1067 1070 1082 1089 1109 
1114-1115 1119 1131 1143-1149 
1151 1153-1156 1160 1163 1167 
1172-1173 1178 1184 1186 1188
1190-1200 1211 1216 1222-1223 
1226-1227 1229 1231 1236 1245
1253-1255 1258 1260 1262 1266
1270-1273 1281 1287 1308-1309 
1314 1317-1320 1326 1334-1335 
1339 1341 1344 1350 1356 1369- 
1371 1373 1376 1379 1381-1382 
1386 1392 1396-1398 1419 1423 
1425-1426 1428-1429 1432 1437 
1440-1441 1448 1466 1470 1482
1502-1503 1507 1511 1513 1516 
1519 1536 1544 1549-1550 1557- 
1559 1573 1589-1590 1598 1608 
1611-1614 1619 1621 1625-1626 
1640 1651 1657-1658 1676-1679 
1693 1696 1703-1704 1713-1714 
1718 1720 1722 1724 1726 1728 
1730-1733 1735-1736 1738-1739 
1742 1745 1755 1759-1761 1765 
1767 1771-1772 1777 1779-1780 
1786 



235-236 520 864 1068 1188 1587
15-18 20-21 24-25 29 34 43 61-63 
77-78 98 101 103 107-108 128 130 
136 146 148 155-156 171 174 181
185 196-198 204-205 208 223 230 
235-236 251 253 2S1 268-269 280- 
281 284-285 288 309-311 321 329 
334 339 346-347 350 357-359 381- 
383 390 407 418-419 430 434-435 
438 443-444 461 464-466 483 490 
494 509 516 519 522 527 557 561- 
562 572-573 590-591 595 597 623 
632 647-648 650 655 669-670 672 
682 690-691 700-701 710 717 736 
746 782 784 788-789 814-815 825 
829 840-841 847 854-855 857-858 
897-900 904 919 925 935-937 946 
948-949 954 960-962 966 969-970 
986 996 1000-1001 1005-1007 1012
1014 1022-1028 1045 1052 1055 
1068 1070 1072 1078 1082 1085 
1090 1109 1115 1118 1120-1128
1136-1137 1144-1145 1149 1156- 
1157 1193-1195 1198 1204-1205 
1220 1222 1234 1257 1262 1271
1274-1275 1280 1285-1286 1294 
1312 1314 1317-1320 1330 1342 
1344-1345 1349-1350 1355-1356 
1358 1364 1369 1379 1383-1384 
1431 1435 1476 1507 1519 1532 
1536 1547 1554 1564 1567 1578 
1582 1587 1593 1595 1601 1608
1615 1619-1621 1638 1644 1661 
1665-1666 1673 1687-1688 1690 
1715 1723 1728 1749 1753 1757
1759-1761 1765 1771 1774 1776 
1778 1781-1782 1786 



fetal brain 



Clontech 



PBRS03 



fetal brain 



Invitrogen 



FBT002 



fetal heart 



Invitrogen 



FHR001 



105 124 160 289 864 1036 1148 
1229 1614 1616 1762 1785 



fetal kidney 



Clontech 



FKD001



5-8 11 40 47 57 65-66 82 85 102 
124 163 171 216 222 224 235-236 



fetal kidney 



Clontech 



FKD002



258 277 280-281 307 310 314 330 
371 387 392 395 403 422-423 431
436 443 455 469 500 519 522 542 
563 572-573 585 600 619 623 650 
654 657-658 660 679 719 731 780 
738 821 833 844 854-855 857 864 
868 878 911 929 958 960 969 990 
992 1007 1046 1087 1103 1129 
1139 1285 1312 1331 1355 1369 
1371 1376 1391 1422 1425-1426 
1440-1441 1470 1543 1598 1601 
1618 1631 1651 1654-1655 1669 
1678-1679 1691-1692 1733 1785 



fetal kidney 
fetal lung 



Invitrogen 
Clontech



FKD007 



352 384 426-427 440 483 602 1060

1131 1324-1325 1636 

20-21 82 163 335 679 988-989 
1000 1227 1230 1320 1554 



FLG001 



fetal lung 



35-36 94 323 371 398 426-427 445
473 549 560 604 616-617 626 631 
649 651 719 746 786-787 832 842 
849-850 864 894-895 1075 1178 
1182 1200 1206 1309 1311 1345 
1425 1493 1567 1576 1620 1686 



Invitrogen 



FLG003



fetal lung 



9 15-16 29 41 47 68-69 83 88-89 
102 124 137 152-153 165 196 224 
229 231 249 254 256 267 291-292 
300 325 333 344-345 352 373 376 
379 384 408 425-427 430 432 467- 
468 475 483 488 493 516 531 535
545 547 549 564 582 602 623 644 
660 662-664 670 673 725-726 728 
761 766-767 774 805 830 852-853 
864 875 921 932 937 946 949 963 
988-989 1014 1016-1017 1024 1027 
1090 1097 1170 1185 1200 1215- 
1216 1224 1258 1290 1309 1320 
1342 1347 1355 1369 1381 1413- 
1414 1431 1438 1449 1491 1512 
1536 1547 1557-1560 1557 1590
1601 1636 1644 1653-1655 1662 
1667 1671 1675 1680-1681 1706 
1739 1760-1761 1769 



Clontech 



FLG004 



fetal liver- 
spleen 



103 276 334 465-466 737 843 1131 
1614 1658 



Columbia 
University 



FLS001 



3-11 13 15-21 25 30-35 41-48 50-
51 54 56-58 60-66 68-69 72 75
77-80 82-83 85 87 89 92-103 105-
110 112 116-124 126-127 130 133
135-139 141 144 147-149 152-153
157 163-165 167-172 174 176-178
180 186 188-190 193-194 196 198-
200 202-206 210-214 219 221-231
233-236 240-244 246-247 250-251
255-256 258 261-265 268-269 272
274 276-278 280-281 284-286 288
293 295 299-301 304 306-307 309
311 314 316 318 320-321 326 329-
332 342 344-345 350 352-353 356-
358 360 362 370-374 376 378-384
386-387 390 392-393 400-401 403
406 408 410-412 415 417 419 422-
437 439-442 444-445 448 452-454
456 459 461-470 472-479 481-483
487-488 490-491 493 500-501 503-
506 509-513 515-520 522-524 526-
525 531 534 536-540 542 547-549
553-554 561-562 564 567-568 571-
576 579 581 583 585-597 599-605
607 610-613 615-621 623-624 626 
628-634 636-640 644 647-650 655- 
660 665 669-670 672 674-675 678 
681-682 684 690-695 697 702 708- 
710 713-714 716-719 725-728 730- 
731 734 736 738 740-741 743-746 
748 750-751 759-766 768 772 774-
777 779 783-788 793 796 798 800- 
805 808 810-812 814 818-819 821-
824 826-832 834-837 843-847 849-
867 869-876 878-883 887 889-895
897-898 902 904-914 916 919 921- 
928 930-937 939 945-950 953-958 
960-961 963-965 967 969 971 974-
978 980-983 986 988-990 992-993 
995-997 1000-1002 1004-1008 1012 
1014 1016-1019 1025-1026 1028- 
1031 1033 1035-1036 1039-1044 
1047 1049-1050 1053-1056 1058- 
1059 1061-1064 1067-1070 1072- 
1074 1076 1078 1082 1085-1087 
1089-1090 1097 1099-1103 1107- 
1113 1115-1119 1121-1123 1125 
1127-1128 1131-1134 1136-1137 
1144-1150 1153 1159-1160 1163 
1170 1175 1177-1178 1188 1190- 
1192 1195-1200 1202 1206 1208- 
1211 1214 1216 1218 1221-1222 
1225 1227 1234 1237 1241 1244 
1246-1247 1251 1254 1258 1261 
1266 1268 1270-1273 1277-1282 
1284-1285 1287-1290 1294 1299- 
1300 1306-1308 1313-1320 1324- 
1325 1327 1330 1332-1333 1338 
1341 1343 1345-1347 1349-1350
1353-1360 1362-1363 1365-1367 
1369-1370 1372-1374 1376 1378- 
1381 1383-1384 1386 1389-1391 
1400 1402-1403 1405-1410 1413 
1415 1417-1419 1422-1429 1431 
1435-1437 1439-1442 1445-1446 
1448-1449 1454 1458-1459 1466-
1470 1472 1474 1477-1478 1480
1482 1485 1491-1493 1496-1498 
1501-1507 1509 1511-1512 1516- 
1519 1524-1526 1529 1532 1536- 
1541 1546-1547 1549-1550 1552- 
1554 1562 1564 1569 1572 1574- 
1575 1578 1581 1583 1587-1588
1591-1592 1594-1595 1597-1598 
1600-1604 1611-1612 1614-1615 
1617-1618 1620-1622 1624-1625 
1627-1628 1630-1632 1634-1639 
1645-1651 1653-1662 1664 1667- 
1669 1671 1673-1674 1676-1688 
1690 1696 1701-1703 1706-1709 
1711 1713-1714 1718-1719 1722 
1724-1727 1731-1733 1738 1740- 
1741 1743-1744 1746 1748 1751- 
1752 1754 1760-1765 1767-1773 
1780 1783-1786 



fetal liver- 
spleen 



Columbia 
University



FLS002



3-11 13 15-21 26 29 32 35-39 42
44-45 48 50-51 54-55 57-58 61 64 
68-69 73-75 78 80 82 84 87 95-98 
100 103 105 107-108 110 112-113 
116-119 122-125 128 130 137-138 
145 147-153 155 157 159 161-163 
166 168 171-172 174-175 177 181 
188-189 193-194 196-198 200-203 
206 212-215 219-221 223 225-229 
231-232 240-244 246-247 250-251 
258-259 262 264 268-269 272 275 
277 280-281 284 286 288 290-292 
295 298-299 301-304 306 308-310 
318 320-321 323 325 329 331 334 
342 348-349 352-353 356 359 368 
371 374 376-379 381-384 386-387 
392-393 397-398 400-401 403 410- 
413 421 423 426-427 429-430 433- 
436 438 440 443 445 448 451-452
454-455 460-463 465-467 469 471- 
473 475-476 478-479 481-483 487 
490-491 493-494 497 500-501 503- 
505 509-513 515-517 519-520 524
526-531 534 537-542 544 547 552- 
554 556 558 561-562 564-567 571- 
577 583-587 590-591 593 595 597
601 604-606 608-613 616-617 619- 
624 626-632 634 637-642 644 647 
649-652 654-659 662-665 669-672 
674-675 681-682 685 688 690 696 
698 700-703 707 709-710 713 717 
719-721 723-724 728 731-732 734 
737-738 742-745 748 752 754 759 
763-766 768 770 773-777 780 782 
784 786 791 795-798 801-802 805 
808 811-812 818 823-824 826-827 
832 834-837 839 843 846 848-856 
858-861 865 867 869 871 873-874
876 878 881-882 887 889 892 894-
898 901-902 904 906-908 913-915
919 921-924 926-932 934-935 937 
939-941 943 946-947 950 953 958 
961 965-967 971 973-975 977-979 
981 984-985 990 992-993 995-997 
999 1001 1004-1007 1009-1011 
1013 1016 1020 1023 1025 1027- 
1031 1033-1035 1039-1042 1044- 
1045 1049 1053 1055-1056 1058- 
1059 1062 1064-1065 1067-1070 
1072-1074 1079 1082 1087 1089 
1093 1097 1099-1103 1105-1107 
1109-1114 1123 1125-1127 1132- 
1134 1140 1143-1145 1148-1150
1156 1158 1160 1163 1172-1173 
1177-1178 1181-1184 1190-1192 
1195-1197 1199 1204 1206 1208 
1211 1214 1216 1219 1227 1230 
1234-1235 1237 1240-1241 1243 
1245 1247 1256 1258 1260-1261 
1264 1268 1270-1271 1275 1278- 
1279 1284-1286 1288-1289 1299- 
1301 1306 1308 1312 1314 1317- 
1319 1323-1325 1327-1330 1334- 
1335 1339 1343-1347 1349-1350 
1354-1355 1357 1360 1362-1363 
1365-1367 1369 1372 1376 1378- 
1380 1386 1389-1391 1394 1400 
1403 1406 1409 1416-1419 1422- 
1427 1429 1435 1437-1438 1440- 
1442 1446 1448-1450 1453 1460- 
1461 1468 1470 1472 1474-1475 
1478 1482 1486 1490-1493 1496 
1498 1500-1504 1506 1508-1509 
1511-1512 1516 1518-1519 1521 
1524-1528 1531 1536-1538 1543 
1547 1550 1554 1556 1564 1567- 
1569 1580 1587-1588 1591-1592 
-1528 
1646- 
•1662 
•1679 
-1692 
•1714 
1730- 
1748- 
•1764 
1779 



1630- 
1649 
1664 
1683- 
1699 
1717 
1733 
1752 
1767 
1783- 



1601 

1631 

16S2 

1667- 

1684 

1702 

1719 

1738 

1758 

1769 

1786 



16TT 
1635 
1654 
1669 
1686 
1707 
1722 
1740 
1760 
1772 



T612" 
■1638 
■1659 
1674 
•1688 
1711 
1726- 
1743- 
-1761 
1773 



1597 
1618 
1641 
1661 
1676 
1691 
1713 
1727 
1744 
1763 
1776 



fetal liver- 
spleen 



Columbia 
University 



FLS003 



103 300 318 321 352 372 379 381 
384 392-393 403 422 424 429 434- 
435 440 444 453 503 515 544 592 
978 1064 1324-1325 1327 1333 
1357 1369 1378 1418 1424 1622 
1646 1649 1680-1681 1689-1690 
1717 1743-1744 1769 



15-16 26 34~58 61 64 70 75 78 89 
98 105 112 116 120-121 123 133 
151 166 176 180 194-196 198 200 
204-206 210-211 220 225-226 230 
235-236 239 247 259 261 267 272
277 280-281 303 310 313 317 320- 
321 329 344 356 371 374 376 379- 
382 395 408 412 414 419 429 434- 
435 441-442 465-466 490 494 504- 
506 509 522 527 534 552-553 562 
567 569-570 572-574 607 631 657- 
658 667 669 672 685-686 702 717 
725-726 732 748 759 761 778 784 
786 809 817 829 837 857 861 872-
873 875 881 889 894-895 909 911 
916 954 963 967 974 977 986 988- 
989 993 995 997 1000 1005-1006 
1008 1014-1015 1020 1042-1043 
1070 1086-1087 1089-1090 1118- 
1119 1122 1144-1145 1148 1153 
1157 1159 1183 1195-1196 1227 
1250 1257-1258 1262 1267 1280 
1285 1307 1312 1314 1317-1320 
1344-1345 1349-1350 1355 1362- 
1363 1403 1405 1415 1419 1425- 
1426 1429 1431 1442 1448 1463- 
1464 1469-1470 1489 1528 1536 
1539 1549-1550 1557-1562 1577 
1583 1598 1601 1611 1615 1622 
1644 1649 1666 1674 1706 1721 
1738 1746 1763-1765 1774 1776 
1779 



fetal liver 



Invitrogen 



FLV001 



676 998 1719 

93 133 214 301 355 374 379 555 
581 601 679 837 847 859 1123 
1236 1270 1313 1324-1325 1327 
1355 1367 1425-1426 1536 1690 
1733 1760-1761 



fetal liver



Clontech 



FLV002



fetal liver 



Clontech 



FLV004 



26 37-39 50-51 58 84 86 89 98 
113 128 131-132 139 155 172 186 
194 198 201 206 211 230-231 256 
261 276 282 286 302 325 359 361 
375 379 383 398 412-413 419 430 
436 448 452 462-463 473 477 503 
519 529 561 569-570 590-591 597 
607 623 626 635 647 660 672 715 
725-726 730 733 761 775-777 788 
826 837 860 874 913 915 921 935 
970 980 986 988-990 992 1000-
1001 1007 1014 1027 1035-1036 
1045 1060 1064 1070 1083 1097 



fetal muscle



Invitrogen 



FMS001 
1099-1102 1116-1117 1121 1164
1173 1198 1208 1228 1240 1258
1266 1270 1277 1298 1317-1320
1324-1325 1329 1336-1337 1369
1383-1384 1399-1400 1403 1409
1433 1505 1514 1542 1551 1554
1557-1559 1562 1589 1599 1620
1632 1644 1650 1652 1671 1675
1712 1725-1726 1743-1744 1754
1766



fetal muscle 



Invitrogen 



FMS002 



119 221 273 402 426-427 463 547 
599 736 869 1000 1033 1083 1266 
1431 1440-1441 1468 1545 1599 
1673 1678-1679 1687-1688 1710 
1712-1714 1723 1725 1731-1733 
1743-1744 1760-1761 1767 



1 4-11 15-16 20-23 25 29 33 40 
43 46 56-57 60-61 64-66 75 82 87 
97-98 105 107-108 113 118-119 
123 133 135-137 139 144 146 148 
151-153 156 163 170 176 180 188- 
189 197-198 200 202-203 210 218 
222 231 246-247 261 263 265-270 
277 285-286 290 293 299 301 307 
311 321 325 328 330 333-335 333 
341 345 351-352 355-356 358-359 
362 368 370 372 376 379-382 384 
388 394 404-405 408-409 411-412 
419-420 424 426-427 436 441-442 
445 448-449 454 462 465-466 472 
476 490 493 504 506 509 515-517 
519 526 531 537-540 547 549 560- 
561 567 572-573 581 584 589 611- 
612 615 623 630-631 635 647 649 
651 657-658 660 662-665 667 669 
672 676 678 681 688 701 704-705
709-710 713 717 720-721 725-726 
728-729 732 748 750 753 759 764 
766 770 775-777 780-781 786 788- 
789 798 809 811 814 816-817 822 
824-826 831 842 857 859 861 863- 
864 881 894-895 908 910-911 916 
918 922-923 928 932-933 935 937 
946 948-949 953 960-961 966-967 
970 975 977 986 990 992-993 999- 
1000 1004 1007 1013 1018 1025 
1027 1032 1035 1041-1043 1054 
1057-1058 1060 1062-1064 1069 
1072 1077 1090-1091 1097 1099- 
1103 1108 1113 1119 1123 1128 
1131 1134 1140 1148-1149 1152- 
1153 1156 1163 1167 1178 1182 
1189 1192 1195-1196 1198 1201- 
1205 1208 1211-1212 1216 1219- 
1220 1222 1225 1240 1243 1258 
1266-1267 1274 1277 1280 1282- 
1285 1299 1310 1317-1322 1324- 
1325 1329-1330 1342 1344 1346 
1349-1351 1354-1357 1365-1366 
1369 1371 1373 1376 1378 1380 
1383-1384 1387 1399-1400 1405 
1410 1427 1429 1431 1433-1435 
1439-1441 1448-1449 1454 1457 
1468 1470 1472 1475 1480-1481 
1487 1490-1491 1493 1498 1509 
1512 1521 1525-1526 1529 1535- 
1536 1547 1549 1557-1559 1588
1592 1595 1597-1598 1601 1603- 
1604 1608 1611 1614 1618 1624- 



fetal skin 



Invitrogen 



FSK001 
1626 1632 1634 1636 1641 1643-
1644 1646 1654-1657 1660-1662
1665 1668 1675 1685 1687-1689
1702-1703 1709-1710 1716 1719
1724 1727 1731-1732 1737-1740
1742 1747 1749 1755 1760-1761
1765 1773 1776-1777 1779-1780
1786



fetal skin 



Invitrogen 



FSK002 



fetal spleen 
umbilical cord



FSP001 



13 286 302 307 313 321 330 335
339 341 354 370 372 385 400 402
408 414 426-427 433 436 450 454
515 544 585 598 767 810 845 939
1076 1109 1155 1317-1320 1326
1333-1335 1343 1347 1350 1369-
1371 1377-1378 1391 1397 1422
1466 1647 1656 1678-1679 1687-
1688 1693 1718 1721 1725 1731-
1732 1739 1755



110 137 211 353 589 927 1108 
1639 1771 



BioChain 
BioChain 



FUC001 



fetal brain



4-8 10 12 14 17 33-36 44-46 57
64 68-69 75 82 85 101 104 113-
114 116 119 122-124 133 137 153-
154 157 161 163 166-167 175 181-
184 186 192 197-198 200-202 212-
215 230 234 246-247 251 256 263
267 271-272 280-281 284 295 301
314 317 321 326 333-335 345 351
356 368 371-373 379-380 386 390
392 394 406 408-410 412 414 416
420 424 427 430-436 438 444-446
454 459 461 463 467 473 482-483
486 488 490 495 504 509 524 526
537-540 547 555 561 574-577 588-
591 593 606 615 620-621 632 637
645-647 650 659-660 662-664 667- 
668 674-675 684 687 696 698 701 
703-705 709 711 714 719-720 725- 
727 732 749-750 762 765 771 775- 
777 780 789-791 793 796 802-803 
814-817 822 833 843 845 848 858 
861 864 875 879 888 894-895 897- 
900 903 906-907 911-912 925 930- 
933 936 940 948 953 960 966 977 
984 990 992 998 1000-1001 1005-
1007 1016 1023 1025 1037 1046- 
1047 1059 1061-1063 1073 1076- 
1077 1089 1094-1097 1112-1113
1115 1134 1144-1148 1151 1154 
1156 1163 1171 1197 1204-1205 
1208 1216 1218 1224 1234-1235 
1243-1244 1246 1279 1283 1286- 
1287 1298 1316 1320 1344 1346 
1350 1357 1359 1371 1373 1375 
1381 1398 1400 1403 1408 1414 
1424 1427-1428 1431 1433 1440- 
1442 1446 1454-1455 1479 1482 
1484-1485 1489 1492-1493 1504- 
1505 1513 1525 1527 1536 1538 
1546 1565 1567 1571 1573 1575- 
1576 1578-1579 1591 1595 1600- 
1601 1608 1612 1615 1621 1624 
1626 1636-1637 1647-1648 1651 
1653 1656 1658 1661-1662 1672 
1675 1682 1684 1686-1688 1690
1709-1710 1722 1727 1729 1735- 
1738 1740-1741 1760-1761 1768 



GIBCO 



HFB001 



4 9 11-13 17-18 22-23 25 37-39 
42-47 50-51 54-55 58 60-61 65-66 
72 75 77 80 82 8* 90-91 94 100-
102 107 110 112-116 118 119 122-
123 126 128 134 136-140 147-148
153-155 157 161 165 169 172 175
181 186 188-189 197-198 204-206
208 210 215 222-223 225-226 230
235-238 240-241 247 253 256-258
260-262 267-269 276 279 281 284
286 289 298 300-302 307 310 318
321-323 325 330-331 339 341 346-
349 352 354 356-359 362 364-365
371-372 377 379-380 382 384 387
390 400 408 414-416 419 424 431
434-435 438 441-443 449 451 453-
455 457-463 470 472-473 475 477-
478 482-483 486-488 490 491 493
496 499-500 502-504 506 507 509-
512 516 519-520 522 525 526 529-
530 537-540 543-544 546 547 566-
567 569-570 572-582 585 588 590-
591 593 595 599 601 604 606-609
611-612 614-620 622-624 630 632
636 643 645-647 650-652 654 659
661 665 667-668 670-672 676 678
681 687 689 692-694 697 699 710
714 717 721 727 729-732 734 736
738 743-746 750-751 759 763 766
770 772 775-777 784 789 791 796
799 802-805 810-811 814 819-821
824 826 830 834-837 839-850 854-
856 858-860 862 864 869 871 876-
877 879 883 886-887 890 891 893-
895 898-901 905 908-910 912-916
919 922-923 925 927 930 933 935-
938 948 952-960 963-964 967 969-
972 975 978-979 981 983 986-987
990 992 995 997 999-1002 1005-
1009 1011-1013 1016 1018-1019
1023 1026 1029-1031 1033 1035
1038 1041 1047 1050 1053 1057
1059 1064 1068 1070 1072-1073
1078-1079 1081-1082 1086 1089
1094 1097 1103 1107-1109 1113-
1115 1121-1122 1127 1134-1135
1138 1140 1143 1148-1151 1153
1156-1157 1159 1167 1170 1175
1193-1194 1200 1202 1207 1209
1211 1216 1219-1220 1226-1227
1229 1232-1234 1240-1241 1243
1246 1249-1251 1253-1254 1258
1267-1268 1271 1276 1279 1282
1285-1289 1293-1294 1305 1307-
1308 1312 1316 1320 1327 1338-
1339 1341-1344 1346 1349 1355-
1357 1359 1365-1366 1369-1370
1373-1375 1379 1386 1389 1394
1398 1409 1413-1414 1416-1417
1420-1421 1425-1427 1430 1433
1437 1439 1442 1445-1452 1454-
1457 1459 1463-1464 1468 1470
1474 1477-1479 1489 1492 1494
1497-1498 1501-1503 1507 1509
1511-1513 1517 1520-1521 1524-
1526 1531-1533 1535 1537 1538
1547 1554 1556-1559 1564 1567
1571 1584 1587 1589 1594 1599-
1601 1611-1612 1614-1616 1619-
1620 1625-1628 1630-1631 1634
1637-1638 1640-1643 1645 1648-
1649 1651 1653-1655 1657-1658
1664-1665 1667 1669 1673 1678- 
1679 1683-1684 1686 1693 1701 
1704-1705 1709 1713-1714 1717- 
1720 1724 1727-1728 1731-1733 
1737-1738 1743-1744 1752 1754-
1755 1757 1760-1761 1765 1772 
1779 1785 



macrophage 
infant brain 



Invitrogen 



HMP001 



5-8 110 204-205 503 634 678 859 
878 933 988-989 1379 1448 1504



Columbia 
University 



IB2002



10 12-13 15-18 22-23 25 29 34
37-39 43 47 50-51 54-56 58 60-63
65-66 68-69 72-74 80 82-83 86 
88-92 97 100 102-104 106-108 110 
112-113 115-116 118 123 128 130
134-136 138-139 143 147-149 151- 
152 154-155 163 165-167 169 172- 
175 181-184 186 193-196 198 201
203-205 209-210 214-215 222 224- 
226 231-232 235-236 239 246-247 
252 257 260 268-269 272 276-277 
279-281 286 288 291-292 295 298 
300-301 304 307 310 313 321-323 
330-331 333-334 339 346-347 349 
352 356-357 362 371-372 377 379- 
380 383-384 392 397 401 406 408 
411 413-414 416 418-419 422 428 
430-431 434-435 438 443 449 453-
454 461 464-466 469-470 472-473
475-476 478 482-483 487 490 492 
494 497 503 507-508 510-513 516
519-520 524-526 530-534 536-540 
547 550-551 561 563-564 566-567 
572-576 579 581-582 584-587 590-
591 593 595-597 607-609 611-613 
616-617 620 622-624 627 631 637 
641 645-647 650-655 657-658 660-
665 667-675 689 691 695 697 699 
703 707 713-715 717 721 728-731 
733-736 739 743 745 751 755 759
763 769-770 772 778 780-781 785 
788-789 793-794 799 803 808 811 
814 825-826 830 834-836 840-843 
845 848-850 854-855 860 862 864- 
865 870 872 875-876 878 886 888
890-891 894-896 898 903-904 916- 
917 919 922-925 927-928 930-932 
934-936 938 941 945-946 948-950 
953-954 959-962 966-969 977 979
981 986-990 992 997 999-1000 
1004-1006 1014 1016 1018-1019 
1024-1025 1033 1036 1047 1051- 
1052 1054-1055 1057-1059 1063- 
1064 1068-1070 1073 1081-1082
1085 1089 1108-1113 1118-1120 
1123-1124 1130 1132-1138 1140 
1149 1151 1153-1154 1163-1170 
1172 1174-1175 1183-1184 1188 
1190 1193-1194 1196-1197 1199 
1204 1208-1209 1211 1218-1222 
1226-1227 1229 1231 1234 1241 
1247 1249 1251 1255 1258 1261- 
1262 1269 1274 1279 1281 1283 
1285 1287-1289 1294-1295 1305 
1307 1313-1314 1316-1320 1325 
1332 1341-1342 1345 1349 1356
1362-1363 1365-1366 1368-1370 
1374 1381 1383-1384 1388 1400 
1403 1406-1407 1413 1417 1420 
1423 1429-1431 1435-1436 1439-
1441 1443 1447-1449 1451-1452
1454-1455 1457 1459 1463-1465
1468 1470-1471 1475 1479 1482-
1483 1485 1493-1494 1496 1498-
1499 1502-1503 1505-1507 1509
1522-1523 1525 1528 1531-1533
1542 1546-1547 1549-1550 1554-
1555 1563 1565-1567 1569 1575
1580 1583-1586 1588 1590 1592-
1593 1595 1598 1600-1601 1608-
1610 1612 1614-1616 1619 1621
1624 1626-1627 1630-1633 1637
1639-1640 1642 1644 1647 1652
1654-1655 1658-1659 1664-1665
1672-1673 1676-1681 1685-1688
1693-1695 1701-1702 1704 1708
1717-1720 1723 1724 1726-1728
1733 1735-1741 1743-1744 1752
1755-1758 1762 1765 1771 1774
1777 1778 1786



Columbia 
University 



IB2003 



infant brain



Columbia 
University 



IBM002



infant brain 



Columbia 
University 



IBS001 



17-18 20-23 29 34 43 60 68-69 

78-80 88 100-101 107 110 112 118 
123 128 133 135-137 146 148 152 
159 166 169 174 194 198 203 215 
223 225-226 229 235-236 247 260 
276-281 286 290-292 295 300-301
310 322 324 331 334 339 346-347 
349-350 352 357 371 376-377 382 
384 403 408-409 414-415 453-455 
472 476 478-479 490 503 507 516 
520 530 534 536-540 551 563 572- 
576 585 587 590-591 593 595-596
601 606 612 616-617 620 622-624
650 652-653 661 665 670-671 674- 
675 678 689 715 717 727-728 730 
734 759 775-777 780-781 785 796 
806-807 811 824 845-846 864 869
875 882 889 894-895 898 904 917 
919 921-923 932 935-936 946 950 
954 962 977 979 997 999-1000 
1005-1006 1009 1011 1017 1024 
1033 1037 1043 1055 1057 1109 
1114-1115 1120 1123 1127 1144- 
1145 1149 1151-1153 1160 1167 
1170 1174 1193-1194 1196 1199 
1202 1206 1209 1220-1221 1226 
1229 1240-1241 1251 1258 1284 
1288-1289 1305 1314 1327 1333 
1344 1347 1350 1356-1357 1365-
1366 1378-1379 1388 1400 1403 
1421 1423 1431 1436 1440-1441 
1446-1447 1457 1459 1471 1499 
1503 1507 1509 1535 1546 1557- 
1559 1567 1572 1587 1595 1598 
1610-1612 1615 1631 1639 1644 
1647 1657-1658 1673 1678-1681 
1683-1684 1701-1702 1708-1709
1713-1714 1719 1757 1760-1761 
1765 1771 1778 

101 113 139 152 2*0 279 290-292
374 377 551 563 608-609 653 659 
814 954 1005-1006 1029-1030 1130 
1164 1209 1258 1294 1305 1320 
1327 1397 1431 1498 1507 1615 
1640 1694-1695 1763-1764 1767 
1779 



10 12 119 175 279-281 321 334 

371 446 551 563 623 652 667 669 
471-412 819" $49' 9^6 1113 1130 
1151 1188 1193-1194 1196 1229 
1258 1265 1271 1287 1317-1319 
1324-1325 1342 1423 1440-1441 
1448 1471 1482 1525 1532 1546 
1562 1569 1588 1591 1610 1618 
1647 1649 1658 



lung, 
fibroblast 



Stratagene



LFB001 



5-9 17 20-21 25 68-69 82 94 105
153 157 197-198 203 207-208 212- 
213 223 262 266 233 302 321 326 
333 356 370 427 430 436 446 462 
472 493 498 503 516 519 527 535 
537-540 542-544 562 565 567 586 
599-600 607 615 630 647 662-664 
692-694 712 719 745 748 775-777 
794-796 810 837 843-847 849 854- 
856 869 876 903 934 953 955-956
964 975-976 984 1000 1005-1007 
1024-1025 1033 1039 1053 1064 
1070 1072 1082 1112-1113 1134 
1136-1138 1140 1195 1223 1232- 
1233 1246 1279 1285 1295 1311 
1320 1334-1335 1343 1427-1428 
1446 1478 1492 1493 1504 1537 
1552 1555 1567 1575 1582 1598 
1620 1625 1632 1638 1645 1654- 
1655 1662 1680-1681 1684 1686 
1690 1696 1702 1711 1733 1741 
1760-1761 1778 1785 



LGT002 



lung tumor 



Invitrogen



5-10 18 20-21 29 33-36 40 43 52 
54-55 61 65-66 68-70 73-75 80 85 
88-89 93-94 100 103 106-108 112- 
113 115-116 118-119 123-124 126 
130-132 135-137 139-141 143-144 
147-148 151-153 155-156 159 161 
164 169 171 179-180 185 190 192 
194 196-199 203-208 210 212-214 
216-217 219 222 233 240-241 244 
246 251-252 255-256 261-262 266
272 276-277 279-281 284 286 288 
290 295 298 301-302 309-312 317 
321 329 332 341-342 344-345 348 
352 358-360 363 368 370-371 376 
380-381 384 389-390 398 400 409
414 423 426-427 430 432-436 443- 
444 450-451 454 462 468 472-477 
480-483 487-488 490-491 493 496-
498 500 503-506 509-512 515-516 
519 521-523 526 530 534 541 544 
547 554 557 564 566-567 572-576 
585-586 588-589 595-596 601 607 
611-612 615 619 621 623 626 630 
632-633 644 647 649 651 655-656 
660 662-665 667 669 672 683-684 
696 700 706 710 713 716 718-719 
722-723 728 734-739 743 750 752
763 765-766 773-778 784-785 787- 
789 791 800 802-803 809-812 814 
824 826 828-829 832 838-839 841-
845 849-850 852-855 857-861 864 
866 874 878-880 882 887 890-891 
897-898 902 904 906-907 910 916 
918-920 922 924-925 927 930-932 
934-935 937 947 950 953 955-956 
961 963 966-967 969 971 977-979 
981 984 986-987 990 992-993 995 
997 999-1001 1005-1007 1009 
1012-1013 1018 1020 1022-1024 
1026 1029-1030 1033 1038 1041 
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Tissue Origin 



lymphocytes 



RNA Source 



Hyseq 
Library Name



SEQ ID NOS: 



1045 1047 1050 1052 1054 1055
1059 1063-1064 1067 1071 1073-
1074 1078 1085 1087 1089 1095-
1097 1104 1106-1107 1109 1112
1116-1117 1119 1126 1134-1135
1139 1141-1142 1144-1145 1148
1152-1153 1156-1158 1167 1170
1172 1178 1195-1196 1198-1200
1202 1204 1208 1214 1216 1219
1222 1227 1234 1241 1247 1252
1257-1258 1265 1267-1270 1276
1278 1280-1281 1283 1285 1288-
1289 1295 1300 1305 1308 1312
1317-1321 1329 1338 1339 1341
1344-1346 1349 1351 1353-1355
1357 1365-1366 1369 1378-1379
1383 1385 1394 1397 1400 1402-
1403 1408 1417 1419 1423-1426
1431 1433-1436 1438 1444 1446-
1448 1454-1455 1460 1466 1468
1470 1474 1480-1481 1483 1486-
1488 1490-1491 1494-1496 1506
1508 1509 1511-1512 1515-1516
1519 1523-1524 1528-1529 1536-
1540 1546 1549-1550 1555 1560-
1561 1565 1567 1569 1575 1588
1591 1593-1594 1596-1598 1600-
1602 1608 1614-1616 1618 1620
1624-1625 1627-1632 1636 1639
1644-1645 1647-1649 1652-1653
1656-1662 1664 1666-1667 1670-
1671 1673-1675 1678 1679 1683
1685-1688 1690-1692 1696-1699
1705 1709 1716-1717 1722 1727
1730 1735 1739 1741 1743-1744
1748-1749 1753 1760-1762 1765
1767 1770-1771 1773 1775 1776
1778-1779 1786



ATCC 



LPC001 



4 11-12 18 24-25 30-31 48 50-51 
56-57 68-69 80 92 98 103 105 110 
126 137 152-153 157 165 172 188- 
189 197 203 210 217-218 222-223 
225-226 229 231 247 251 256 264 
272 280-281 284 300-301 321 325- 
326 339 348 352 357 371 382 384 
390 400 404 412 414 421 423 426- 
427 430-431 445 447-448 451 454- 
455 475 503 516 526-527 530 537- 
540 549 556-560 553 574 577 589 
602 613 615-617 621 623 628-630 
636-637 647 649 657-659 690 697 
717 723 755 764 775-777 780 786 
789-790 793 800 802 822 838 849 
866 869 876 881-883 892 898 906- 
907 911 921-923 928 975 990 992 
996 1001 1004-1007 1033 1050 
1054 1078 1107 1135 1140-1141 
1143 1148 1158 1163 1177 1199 
1205 1216 1226 1231 1236 1241 
1244 1250 1258 1260 1265 1269- 
1271 1290-1293 1308 1312 1317 
1319-1320 1339 1345-1346 1348 
1350-1351 1357 1367 1369 1379 
1381 1383-1384 1386-1387 1389 
1394 1397 1405 1423 1425-1428 
1431 1437 1446 1448 1461 1466 
1470 1472 1474 1482 1492 1506
1528 1537 1546 1549 1591 1598 
1600 1603-1604 1606 1627 1636 
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Tissue Origin 



RNA source 



Hyseq 
Library Name 



SEQ ID NOS: 



1638 1647-1649 1651 1658-1659 
1664 1676-1677 1680-1681 1687- 
1688 1699 1711 1715-1716 1726 
1728 1737 1740 1746 1748 1752 
1756 1758 1777 1773 



leukocyte 



GIBCO



LUC001



3-4 10-11 13 15-18 20-21 24-25 
30-31 35-36 40 43-45 48 50-51 
54-58 60-63 68-69 75 79-80 82-83 
85 88-91 93-96 98 100 103-104 
107-108 112 116 119 123 125-128 
134-140 142 147-149 151 153 155 
157 162-163 167 169-172 174 177- 
179 186 190 192-199 203-207 210 
212-215 217-219 222-223 229 235- 
236 247 251 255-258 260 262 272 
274-277 280-281 285-286 297-301 
307-310 313-314 316-317 321 325- 
330 333-334 340-342 348-349 352 
354-358 370-371 380-385 387-388 
400 405 408-410 412 414-416 421- 
425 430-431 434-435 437 439 441- 
442 445-451 453-454 456 459 461- 
464 468-472 474-479 481 483-485 
487-491 496 499-501 503-504 509- 
513 516-519 522 526-527 529-531 
534 536-540 542 547-549 553-559 
566-567 571 574-577 579 582 584- 
586 589 593 595-597 601-602 604 
606-607 611-613 615-621 623 627- 
629 633 636-637 642 644-650 655 
659-660 662-665 667 669 674-675 
678 682-684 692-696 698 700 706
708 710 716-720 725-726 729-736 
738-739 743-746 749 751 753 756 
759 765-766 768 770-778 780 784-
786 788-790 793 796 798 800 802- 
803 810-811 814 817 819 826 829- 
830 832 834-836 838 843 845-860 
863-864 866-871 877-879 881-892 
894-896 898 902 904-914 916 919- 
925 927 930-932 935-936 941-942 
945 948-949 953 955-956 958 960- 
962 964 967 970-971 973 975 977 
985-990 992-993 995-996 999-1002 
1004-1009 1011 1014 1017-1019 
1022-1023 1025 1027 1029-1031 
1033-1036 1038 1041 1043 1047 
1050 1053-1054 1058-1059 1061- 
1062 1064 1068 1070 1072 1078 
1085-1086 1089-1091 1093 1097 
1106-1107 1110-1113 1115-1117 
1122-1123 1125 1129 1132-1133 
1135-1137 1140-1145 1152 1158 
1163 1168 1170-1174 1176-1178 
1180 1182-1183 1186 1195 1198- 
1200 1202 1205-1206 1211 1216 
1219-1221 1223-1227 1230-1236 
1238-1242 1247 1252 1254 1256 
1258 1261-1262 1264-1265 1269-
1270 1272-1275 1277 1280-1284 
1287-1293 1299-1300 1306 1308 
1312-1313 1317-1320 1322 1324- 
1330 1333-1335 1339 1341 1343- 
1347 1349 1353-1357 1359-1361 
1365-1367 1369-1370 1373-1374 
1377 1379-1381 1386-1387 1394 
1400 1403 1409 1419 1423 1425- 
1428 1430-1431 1433-1434 1437- 
1438 1440-1442 1446-1448 1450 
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Tissue Origin 



leukocyte 



RNA Source 



Clontech 



Hyseq
Library Name



LUC003



SEQ ID NOS: 



1453 1458-1459 1463-1464 1468
1470-1471 1474 1477-1478 1482-
1488 1490-1493 1496-1501 1504
1506 1509 1512-1513 1516 1519
1521-1522 1524-1525 1527-1528
1531 1534 1538 1541 1545-1547
1549-1550 1553 1555-1556 1560
1565 1567 1575 1580 1589 1591
1594 1596 1598 1600-1602 1606-
1608 1611 1614 1620-1621 1624
1626-1629 1631-1632 1636 1638-
1639 1641 1644-1645 1648-1650
1653-1655 1658-1660 1662 1659-
1670 1675-1679 1684 1688 1690-
1692 1696 1700 1702 1707-1709
1711 1716-1717 1720 1723 1725-
1727 1733 1737-1738 1741 1743-
1744 1748-1749 1752 1755 1760-
1762 1765 1769 1771-1772 1781-
1784 1786



4 35-3* 44-4* 61 68 69 75 82 102
119 139 154 179 197 244 280-281
324 372 404 430-431 455 461 476-
477 481 503 537-540 554 575-576
581 589 608-609 621-622 624 630
632 647 662-664 669 679 698 764
773 775-777 802 848 851 856-857
879 905-907 915 949 952 990 992
1002 1113 1119 1170 1183 1216
1236-1237 1241 1275 1346 1353
1357 1359 1377 1506 1515 1534
1553 1591 1600 1613 1614 1621
1628 1670 1676-1677 1691-1692
1699 1733 1738 1772



melanoma from
cell line ATCC
#CRL 1424



Clontech



MEL004



2£ 35-3£ 43 80 104 126 128 150
163 166 188-189
197 210 215 220 271 277 280-281
310 317 336-338 345 351 372 380
381 383 387 412 415-416 430 445
448 454 456 467 461 490 499 503
526 528 546 546 567 575-576 588
601 613 615 647 660 665 734-735
737 759 778 787 790 800 832 845
856 859 869 878 883 887 905 914
932 934 958 976 985 990 992 999
1000 1025 1031 1038 1050 1055
1068 1074 1068 1099-1102 1107
1136-1138 1149 1156 1163 1172
1190 1195 1200 1214-1215 1217
1226-1227 1235 1238-1239 1244
1253 1278*1230 1293 1311 1320
1330 1334-1335 1345 1355 1367
1386-1387 1394 1403 1406 1414
1423 1437 1442 1465 1521 1529
1536 1539 1541 1547-1548 1582
1620 1626 1631 1638 1647 1653
1660 1667 1669-1670 1680-1681
1696 1704 1715 1724-1725 1731-
1732 1750 1760-1761



mammary gland



Invitrogen



MMG001



5-8 10 12 14 18 20-21 24 25 29
33-39 42-43 52 55-58 60 64 68-69
71 73-74 79-80 82 89 98 100 103
106 108 112 123 128 133 137 144-
146 148 150 152 154 158 159 165-
166 170-172 174 176 178 181-185
188-190 194 198 201-206 210 217-
222 224 227 228 231 233 237 247
251 253-254 256 261-263 266-267
271 276-277 279-281 284-286 288
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Tissue Origin 


RNA Source



Hyseq
Library Name



SEQ ID NOS:




















290 297 299 301-304 309-312 318
320-321 323-325 327-329 331-332
334 339 341 344-345 348 350 356
359-360 362-363 368 371 376 379-
303 380 390 393-395 397-398 405
406 412 414-415 423 430 434-437
441-444 448 451-455 462-464 474
476 479 482 485-486 488 490 494-
495 498 503 506 509-512 516-517
519-520 522 527 529 534 537-541
547 549 554 557 562 572-574 587
589-591 597 602 607 618 623 628-
629 632 624-640 644 647-648 650-
652 655 657-658 660 665 667 669-
672 674-676 679 682 688 695-696
706-707 710 713 717 720 722-730
732-734 736 738 743 747-748 750
755 759 761 766 770 780 784 786-
789 794 803 806-807 809 814 817-
822 827-829 837 842 854-858 863-
864 866 869-870 872 878 881 889
893-900 904 906-907 911 916 919
921-923 926 935-937 946 948-949
953-954 957 960-961 963 965-966
970 977-978 984-989 993-997
1000-1001 1005-1006 1008 1013-
1014 1016-1017 1023 1025 1027
1032-1033 1036 1039 1043 1045
1055 1057-1058 1063 1058-1075
1077-1078 1085 1087 1089-1091
1095-1102 1107-1108 1112-1119
1121-1123 1131-1133 1136-1137
1139-1142 1144-1145 1148-1149
1153 1159 1167 1170 1172-1173
1183-1185 1190-1192 1196-1199
1207-1208 1212 1216-1218 1222-
1223 1225 1231 1234 1240-1241
1247 1253-1254 1258-1259 1261-
1262 1270-1280 1283 1285-1286
1298 1307 1314 1316-1320 1323-
1325 1330 1334-1335 1342-1345
1349-1352 1354-1355 1359 1369-
1370 1377 1379 1381 1383-1384
1389 1405 1414 1419 1421-1423
1425-1426 1428-1429 1431 1434-
1437 1439 1448-1449 1454 1457
1460-1464 1466 1471 1480-1483
1487 1489-1491 1493 1505 1507
1512 1519 1526-1526 1532 1534
1536 1539 1542 1547 1549-1550
1554 1561-1562 1564 1567 1572
1576-1579 1581-1582 1587-1588
1592 1594 1596-1597 1601-1602
1607-1608 1610 1612-1616 1618
1621-1622 1625-1626 1631 1635-
1636 1641 1643-1644 1647 1650
1652 1654-1655 1657-1658 1660
1662 1664-1666 1669-1671 1673-
1674 1676-1677 1680-1685 1689-
1692 1701 1706 1713-1715 1719-
1720 1723-1728 1730-1732 1738
1740 1742-1744 1746-1747 1749
1751 1753 1760-1762 1765-1768
1771 1774 1776-1777 1779 1783-
1784 1786










induced neuron
cells



Stratagene



NTD001



29 35-36 30 116 123 156 163 181
214 230 280-281 284-285 307 321
330 340 358 371 375 377 380 382
422 424 492 497 532-533 542 546
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Hyseq 
Library Name 



SEQ ID NOS: 



retinoic acid

induced 
neuronal cells 



Stratagene



NTR001 



neuronal cells 



Stratagene



pituitary
gland



Clontech 



NTC7001 



PIT004



549 566 58 b S9^ 4l2 645-647 654 
734 775-778 780 752 799 821 826 
856 858 875 936 953 985 990 992 
1041-1043 1055 1072 1104 1193- 
1194 1200 1223 1246 1253 1274
1288-1289 1291 1294 1311 1320 
1349 1359 1412 1423 1485 1620 
1623 1645 1684 1705 1715 1751
5-8 78 268-269 277 383 431 506
623 677 731 999-1000 1199 1425- 
1426 1547 



29 65-66 80 82 110 119 146 152
166 174 181-185 198 227-228 253 
284 309 325 332 334 336-338 375 
391 393 406 414-416 454 465-466 
470 488 503 506 510-512 519 537- 
540 572-574 597 602 607 623 647 
661 700 702 716 743 771 792 858 
904 948 954 977 1000 1005-1006 
1025 1064 1068 1122 1148 1185 
1219 1226 1234 1246 1271 1283 
1295-1296 1311 1317-1320 1329- 
1330 1350 1355 1365-1366 1378 
1383-1384 1400 1412 1445 1505 
1539 1547 1578 1647 1656 1683 
1690 1738 1749 1783-1784 



311 314 379 408 419 430 454 1055
1095-1096 1272-1273 1312 1320 
1378 1652 1671 1720 1725 1736 
1741 1755 



Clontech 



PLAOO* 



prostate 



Clontech 



5-8 124 208 277 370 843 906-907
1280 1317-1319 1369 1609 1621 
1737 



PRT001 



9 46 57 71 107 147 171 177 197
201 229 231 242-243 274 280-281 
307 310 317 330 358 373 382-383 
400 430 434-436 461-462 469 477 
489 497 500 505-506 513 521 526 
531-533 547 618 649 657-658 662- 
664 710 729 767 771 789 820 861 
871 874 890-891 905 938 945 963- 
964 988-989 1002 1025 1033 1045
1061 1095-1096 1112 1125 1142 
1196 1198 1202 1232-1233 1241 
1258 1272-1273 1287 1295 1313 
1333 1341 1344 1349 1360 1362- 
1363 1367 1437 1442 1447 1475 
1478-1479 1482 1489 1513 1517 
1527 1531 1536 1598-1599 1628
1636 1657 1680-1681 1687-1688 
1717 1738 1743-1744 



Invitrogen 



REC001 



17-18 29 33 62-63 71 73-74 83 86 
113 126 146 153 158 167-169 195 
200 206 261 309 312 341 344 368 
373 388 395 408 414 420 430 441- 
442 446 448 464 468 483 517 537- 
540 547 567 585 589 602 623 628- 
629 632 645-647 651 657-658 669 
717-719 721 725-726 738 748 750 
756 762-763 766 770 774 790 819 
825 843 849 851 881 903 909 948-
949 960 986 996 1020 1023 1033- 
1034 1064 1067 1070 1075 1086 
1108-1109 1113 1130 1139 1153 
1159 1172 1178 1185 1187-1189 
1205 1220 1225 1240 1244 1271
1317-1320 1323 1334-1335 1350- 
1351 1355 1369 1373 1375 1425-
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Tissue Origin 


RNA Source 


Hyseq 


SEQ ID NOS: 






Library Name 










1426 1436 1439 1469 1474 1477
1482 1546 1587-1588 1592 1596
1610 1622 1627 1644 1658 1662
1665-1666 1669 1675-1677 1749
1786


salivary gland



Clontech



SAL001



10 55 97 103 110 140 149 152 158
198 217-218 242-243 256 301 308
312 321 333 351 354 360 410 437
448 473 487 494 496 501 535 555
569-570 572-573 590-591 624 636
651 759 762 764 768 771 788 800
809 826 848 865 879 906-907 925
933 963 1016 1020 1025 1040 1046
1055 1066 1103 1150 1172 1181
1234 1281-1282 1288-1289 1298
1315 1320 1333 1336-1337 1346
1359 1373 1379 1424 1447 1449
1474 1482 1492 1494 1498 1511
1523-1524 1537 1554 1596 1626-
1627 1636 1652-1655 1658 1665
1671-1672 1691-1692


salivary gland



Clontech



SALS03



158 326 1423 1463-1464


skin
fibroblast



ATCC



SFB001



1320 1400



skin
fibroblast



ATCC



SFB002



262 736 1025 1253



skin
fibroblast



ATCC



SFB003



709 1119 1350 1631 1653








small
intestine



Clontech



SIN001



25 142 146-147 151 155 198 203
244 260 271 280-281 286 288 298
301-302 308 312 334 340 371 398
408 412 414 416 423 426-427 430
434-435 445 452 454 478 503 516
519 521 523 543 547 549 555 559
563 569-570 585 592 604 611 626
628-629 632 650 659 681 710 714
718 750 764 780 798 829 842 857
859 866 887 892 894-895 901 904
906-907 912 919 935 997-998 1000
1007-1008 1026-1028 1044 1055
1089 1097 1116-1117 1131 1148
1169 1199 1219 1234 1247 1264
1279 1316 1320 1326 1341 1343
1349 1351 1374 1387 1398 1400
1403 1407 1423 1428 1468 1498
1501 1521 1550 1556 1585 1597
1636 1638-1639 1645 1653 1656
1662 1671 1675 1684 1691-1692
1704 1711 1717 1719 1722 1725-
1726 1729 1733-1734 1743-1744
1762 1767 1780 1785


skeletal
muscle



Clontech



SKM001



18 20-21 82 84 101 118 134 148
151 153 166 225-226 258 274 277
289 329 361 412 414 424 440 452
459 470 488 503-504 537-540 647
660 673-675 715 773 780 786 830
905 922 950 963 982 990 992 1020
1047 1063 1115-1117 1121 1134








i^oft 1 5<;fl 1 9tta loon i^oi i77o 
i**o ltoo x<yo u^i xj«? 








1336-1337 1343 1409 1413-1414
1509 1599 1624 1644 1653 1712


skeletal
muscle



Clontech



SKM002



168 1683 1712



skeletal
muscle



Clontech



SKMS03



235-236 1409



skeletal
muscle



Clontech



SKMS04



235-236



spinal cord



Clontech



SPC001



4 9 11 17 30-31 35-36 43 46 60
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tissue Origin 



adult spleen 



RNA Source



Hyseq 
Library Name 



SEQ ID NOS: 



82 AS 92 94 106 110 116 139 157
167 198 204-205 210 215 229 256
259 277 280-281 300-302 304 315
317 372 379 387 392 419 426-427
430 433 448 467 473 487 489 506
509 513 519 524 526 537-540 543
547 549 551 559 567 569-570 593
607 616-617 623 625 637 649-650
652 657-658 670-671 673 679 681-
682 709 711 715 719 728-729 734
749-750 753 775-777 781 709 791
809 820 832 834-836 847 849 854-
855 858 861 864 871-872 875 884
898 906-908 917 919 924 934 942
944 970 985 990 992-993 998 1013
1039 1053 1059 1065 1072 1075
1077 1082 1085 1097 1103 1109
1116-1117 1128 1134 1151 1170
1174 1192-1194 1215 1225 1241
1243 1283 1294 1307 1312 1320
1323 1327 1330 1350 1353 1354
1356 1359 1368 1375 1400 1406-
1407 1423 1429 1437 1443 1448
1454 1470 1482 1492 1501 1508
1511 1529 1538 1548-1549 1565
1571 1578 1598 1600 1614 1625
1627 1630 1639 1646 1651-1652
1670 1686 1696 1740 1751 1755
1771



Clontech 



SPLc01



stomach 



Clontech 



STO001 



11? 312 32<T 348 424 426-427 431 
845 866 1320 1330 1333 1344 
1355-1357 1371 1387 1397 1446 
1538 1579 1669 1686 1739 1767 



thalamus 



10 15-16 61 68-69 100 117 149 
197 201 227-228 231 249 273 280- 
281 287 291-292 302 312 358 362 
426-427 430 446 462 475 479 535 
597 620 630 651 662-664 722 739
780 782 785 846 919 960 964 966-
967 976 1008 1012 1032 1042 1063 
1071 1135 1170 1208 1234-1235 
1259 1277 1280-1281 1322 1349 
1359 1369 1449 1468 1474 1478 
1487 1493 1498 1557-1559 1622
1634 1651 1653 1729 



Clontech 



THA002 



9 11 25 85 87 112 137 146 180 
190 198 206 210 212-213 235-236 
239 261 268-269 279 290 301 325 
333-334 341 351 356 364-365 379
388 393 396 419-420 441-442 458 
477 483 508 525 531 549 567 606
608-609 647 681 715 725-727 736 
774 782 784 794 827 883 890-891 
899-900 961 997 999-1001 1004 
1034 1055 1097 1129 1144-1145 
1150-1151 1157 1172-1173 1177 
1193-1194 1208 1220 1249 1280 
1305 1345 1355 1369 1434-1435 
1440-1441 1454 1496 1546 1549 
1562 1572 1578 1590 1594 1613- 
1614 1640 1651-1652 1671 1687- 
1688 1703 1743-1744 1746-1747 
1753 



thymus 



Clontech 



THM001 



44-4£ 54 57-58 62-64 79 104 123
126 134 153 193 212-213 218 242-
243 258 274 277 279 297 301 307
327 330 333 342 351 358 371 410
430 445 465-466 468 471 483 487
493 503 506 509 517 526 535 537-
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Tissue Origin 



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



540 546 548 554 567 584 586 590-
591 604 612 621 638-640 645-647 
649 656 660 665 670 698 710 720 
728 735 739 746 759 762 766-767 
775-777 780 784-785 800 802 809 
824 826 828 845 851 858-859 864 
866 870-871 878 884 887 892 899- 
900 927 930-931 967 983 986 990 
992 999 1014 1029-1030 1033 1059 
1066 1073 1103 1107 1113 1116- 
1117 1119 1140-1142 1158 1163 
1172 1177 1195 1206 1209 1213 
1216 1218-1219 1221-1222 1227 
1271 1277 1282 1320 1329 1349 
1367 1369 1383-1384 1417 1419
1423 1425-1427 1448 1477 1488 
1493 1536 1554 1620 1644 1646 
1649 1654-1655 1661-1662 1669- 
1670 1674 1676-1677 1685-1688 
1707 1711 1731-1732 1737 



thymus 



Clontech



THMC02 



5-9 15-21 25 33 35-3* 43-45 48*"" 
50-51 54-55 60 75 63 87 89 93 
98-100 102 105 112 117 135-137 
141 143 146 157 167 169 192 195 
211 217-219 222 224 229 233 235- 
236 240-241 244 251-252 256 261- 
262 268-269 286 288 290 295 297 
301-302 309-310 315-317 321 324 
327 334 342 350 352-353 360 370- 
373 382 384 400 403 410 414-416 
424 430-431 436 445 454-456 461 
464-467 470 472 474-476 483 486 
497 500 504 506 513 516 519-520 
524 526 530-531 534 537-540 549 
554-555 565-566 569-570 572-573 
575-577 586-587 595 603-604 606 
612 630-632 634 636 647 650 657- 
660 666-667 669 673-675 678 698 
700 703 708 720 725-726 731 738- 
739 743-744 750-753 757 759 763- 
765 767 772-779 787 789-790 798 
800 810 823 829 834-836 841 848 
854-856 859 861 864 870-871 881 
890-891 898 908-909 913 928 933
941 949 958 961 963 967 969 975 
981 986 988-990 992 999 1007- 
1008 1014 1016 1039 1041 1073- 
1074 1079 1089 1097 1109 1114- 
1117 1122 1131 1140-1141 1144- 
1145 1163 1172 1175-1177 1186 
1196 1198 1206 1211 1216 1220 
1223 1227 1234-1243 1261-1262 
1267 1271 1280-1281 1284 1290 
1308 1317-1320 1322 1324-1325 
1327 1330 1334-1335 1339 1346 
1350-1351 1355 1357 1360 1370 
1374 1377-1379 1386 1389-1390 
1392 1397 1400 1402 1406-1407 
1417 1423 1425-1427 1440-1441 
1466 1474 1477 1483 1493 1498 
1504 1506 1525 1536 1545 1549
1566 1594 1598-1600 1608 1611 
1614 1621 1623 1625 1632 1639 
1641 1644 1647 1649 1653-1656 
1658 1662-1663 1671 1673 1678- 
1681 1686-1688 1693 1705 1707 
1711 1717-1718 1726-1727 1731- 
1733 1737-1738 1743-1745 1756- 
1761 1771-1772 1779 1786 
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Tissue Origin 



thyroid gland 



trachea 



RNA Source 



Clontech 



Clontech 



Hyseq 
Library Name 



THR001



TRC001 



SEQ ID NOS: 



4 9-10 20-21 37-39 48 50-51 54-
57 60-61 65-66 71 83 94-96 98- 
100 102 104 110 112 115-117 119 
123 127 133 136-137 140 149 152-
153 155-158 163-164 168-169 171
186 190-192 197 201-203 219-220 
229 233-237 246-247 253 256 258 
262 265-266 268-269 277 280-281 
284-286 288-289 298-299 302 309- 
311 317 321 326 332 335 341-342 
344 348 350 354 358-359 363 368 
371-373 382-383 385 394 398 400- 
401 411 414-415 421 424 430-431 
433-436 443-446 450-452 454-455 
458 472-474 476-478 482 484-485 
487-488 490-494 496-497 500-501 
503-504 506 509-513 516-517 519 
524 526-527 529 535-540 547 549 
562 564 569-570 575-576 588 594-
595 601-602 604 606 610 612 615- 
617 619-623 628-630 634-635 642 
647 649-651 660 662-665 668 670 
681 690-694 696 698 700 709 721
727-729 732 734 738 740-741 743 
745 750 759 761 763 765 770 773 
780 785 795-796 798 802 804 823- 
824 826 828 833 838 841-845 847 
849 857-860 867 874-875 878 880-
881 887-888 890-892 894-895 898 
908 910-911 913-914 922-923 926- 
927 929 932-934 937 939 941-942 
948 953 957 961 963-964 966 978-
979 981-982 987 990 992 1001 
1004-1006 1010 1014 1020 1024 
1033 1038-1039 1044 1047 1050
1052-1054 1056 1058 1068 1070- 
1071 1077-1079 1088 1094-1097 
1105-1106 1112-1113 1116-1117 
1124 1126 1128-1129 1131 1134 
1136-1137 1142-1143 1146-1147 
1149-1150 1156 1161-1164 1167 
1170-1173 1177-1181 1190 1192 
1197 1200 1204 1208-1209 1214 
1217 1219 1222 1230 1232-1233 
1235 1241 1245 1247 1254 1257- 
1258 1260 1262 1271-1273 1283 
1286-1289 1299 1306 1314 1320 
1330-1332 1334-1335 1342 1345 
1349 1365-1367 1370-1372 1374 
1381 1394 1407 1419 1428*1436- 
1437 1440-1441 1443 1446-1449 
1454 1459 1461-1462 1468 1470-
1471 1475 1477 1479 1482 1491 
1497-1498 1504-1505 1507 1513 
1522 1524-1526 1528 1531 1534 
1536-1537 1548 1550 1553 1555-
1559 1562 1567 1578 1590-1591 
1597 1599-1601 1612 1614 1616 
1619-1620 1622 1624-1626 1628 
1631-1632 1634 1636 1639 1644- 
1645 1648 1651 1653-1656 1658 
1660 1662-1663 1667 1669 1671 
1675 1678-1681 1683-1686 1689 
1691-1692 1703 1709-1711 1717 
1724-1726 1729 1734 1737-1738 
1740 1743-1744 1749 1753 1759- 
1761 1770 1777 1766 



* 29-31 46 48 87 104 107 110 13& " 
158 222 262 266 286 301 318 331 
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Tissue Origan 



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



uterus 



Clontech



UTR001 






352 372 377 384 414 424 445-444" 
454 472 474 491 496 560 579 588 
593 597 607 612 626 681 702 719
810 859 866 878 894-895 912 916
922 932 935 1046 1075 1080 1099- 
1102 1113 1208 1215 1232-1233 
1237 1281 1312 1385 1387 1405 
1414 1424 1430 1437 1447 1505 
1569 1579 1586 1600 1641 1653 
1667 1671 1676-1677 1683 1691- 
1692 1711 1717 1726 1772 



17 19 25 41 46 57-58 61 89 104
108 139 152 174 198 200-201 206 
263-265 274 290 387 408 420 438 
446 448 452 473 491 493 499 503 
506 513 519 522 526 530 542-543 
560 601 610 632 659 665 720 751 
773 780 833 845 857 872 877 912 
929 934 937 996 1009-1011 1018 
1050 1075 1107 1124 1170 1219 
1258 1279 1287 1310 1320 1323 
1343-1344 1375 1437 1451-1452 
1478 1481 1498 1519 1521 1536 
1552 1579 1597 1602 1606 1620
1626-1627 1649 1652 1661 1670 
1719 1722-1723 






WO 01/53312 



PCT/US00/34263 



TABLE 2 



SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY


1 


Y41736 


Homo sapiens


numan fkuxjljli protein 
sequence . 


1398 


100 


2 


Y66656 


Homo sapiens


Membrane-bound protein
PRO943.


2389


99 


3 


AF113136 




IL-1 receptor-associated
kinase-M; IRAK-M


3043 


100 


4 


AF017806


Mus musculus 


Zn-15 transcription factor 


6351 


77 


5 


X02761 




fibronectin precursor


10535 


98 




X02761 


Homo sapiens 


fibronectin precursor


8990 


89 


8


X02761 


Homo sapiens 


fibronectin precursor 


12564 


99 


9 


AJ011679 


Homo sapiens 


Rab6 GTPase activating 
protein, GAPCenA 


5251 


99 


10 


W88501 


Homo sapiens 


Human stomach carcinoma clone 
HP104 IS -encoded protein. 


2381 


100 


11 


AF117754


Homo sapiens 


thyroid hormone receptor- 
associated protein complex 
component TRAP240 


11336 


98 


12 


Z97630 


Homo sapiens 


dJ466N1.4 (novel protein 
similar to ANK3 (ankyrin 3, 
node of Ranvier (ankyrin 
G))) 


896 


100 


13 




Homo sapiens 


Protein regulating gene
expression PRGE-13.


1894 


98 


14 


AT a IJ^J / 


Homo 
sapiens 


triggering receptor expressed 
on myeloid cells 2 


1238 


100 


16 




Homo sapiens 


RACK-like protein PRKCBP1


3124 


99 


17 


AF201303 


Homo sapiens 


dhfr oribeta-binding protein
RIP60 


3130 


98


18 


AF064205 


Homo sapiens 


dynactin 1 p150 isoform


6377


100 


T Q 


uooo^y 


Saccharomyces
cerevisiae


Ynrl21wp 


174 


26 




ABU32903 


Homo sapiens 


guanosine monophosphate
reductase isolog 


1601 


99 


9 1 

cx 




Homo sapiens 


guanosine monophosphate
reductase isolog 


1485 


99 


22 


nr X *k \}p U / 


Homo sapiens 


Ca2+/calmodulin-dependent
protein kinase kinase beta 


3083 


99 


23 


AF140507 


Homo sapiens 


Ca2+/calmodulin-dependent 
protein kinase kinase beta 


2300 


99 


24 




Homo sapiens


chondroitin 4-O-
sulfotransferase


2211 


99 


25


U33460


Homo sapiens


DNA-directed RNA polymerase
I, largest subunit 


8777 


98 


26 


Y44488


Homo sapiens 


ACRP30R2 variant protein. 


1387 


100 


27 


U43701


Homo sapiens 


ribosomal protein L23a 


791 


100 


28 


U02032 


Homo sapiens 


ribosomal protein L23a 


767 


97 


29


Y41324 


Homo sapiens 


Human secreted protein 
encoded by gene 17 clone 


1083 


99 


30


W71749 


Homo sapiens 


Human ubiquitin conjugation
system protein 2.


715 


90 


31


W71749 


Homo sapiens 


Human ubiquitin conjugation 
system protein 2.


431 


82 


32 


AF231917 


Homo sapiens 


long-chain 2-hydroxy acid
oxidase HAOX2


1811


100


33


Z29481 


Homo sapiens 


3-hydroxyanthranilic acid 
dioxygenase


1507


99 


34 


AB001451 


Homo sapiens 


Sck 


2869 


100 


35 


Y00644 


Homo sapiens 


precursor polypeptide (AA -34 
to 287) 


1667 


99 


36 


Y00644 


Homo sapiens 


precursor polypeptide (AA -34 
to 287) 


1104 


98 


37


Y78795 


Homo sapiens 


Human antizuai-2 (AZ-2) amino 
acid sequence. 


3586 


78 


38 


Y78795 


Homo sapiens 


Human antizuai-2 (az-2) amino 
acid sequence. 


4726 


99
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TABLE 2 



SEQ 
ID 
NO: 


ACCESSION
NUMBER 


SPECIES 


DESCRIPTION


SMITH- 
WATERMAN 
SCORE 


%
IDENTITY 


39 


Y78795 


Homo sapiens


Human antizuai-2 (AZ-2) amino
acid sequence.




77 


40 


U93121 


Homo sapiens 


M-phase phosphoprotein-1


3747 


100 


41 


Y42750 


Homo sapiens 


Human calcium binding protein
1 (CaBP-1).


795 


100 


42 


AP282626 


HoTHrt A T> 1 An4 


1 » l*0yi Y\ 


1189 


100 


"43 " ■"" 


G02150 




Human secreted protein, SEQ
ID NO: 6231.


384 


94 


44 


U19617 




Elf-1


2724 


88 


45 


U19617 


Mus musculus


Elf-1


2062 


66 


46 


AF100758 


Homo sapiens


ustcoinauctive caccor uir 


1536 


100 


47 


Y87591 


Homo sapiens


nuiuan o tr RUU 1 I — 1 pZTOCem, oCilJ 

ID NO:24. 


1737 


99 


49 


X04145 


Homo sapiens 


T3 gamma precursor (aa -22 to 

1SUJ 


942 


99 


51 


X63547


Homo sapiens 


oncogene 


5845 


99 


52 


M94043 


Rattus
norvegicus


rab-related GTP-binding
protein


1089 


96 


53 


L31783


Mus musculus


uridine kinase 


917 


71 


54 


X83973 


Homo sapiens 


transcription factor 


4486 


99 


33 




Homo sapiens 


chloride channel protein 7 


4128 


99 




W74805 


Homo sapiens 


Human secreted protein 
encoded by gene 77 clone 
HOEAS24 . 


1491 


100 


3 / 


Z509U7 


Homo sapiens 


Human TBC-1 cDNA from second 
transcript . 


4824 


100 


58 


U79994 


Homo sapiens 


similar to ankyrin of 
Chromatium vinosum.


6089


99 






Homo sapiens 


similar to ankyrin of 
Chromatium vinosum. 


4014 


91 




Y59738 


Homo sapiens 


Human normal ovarian tissue 
derived protein 15 . 


601 


100 


61 


AB031069 


Homo sapiens 


protein containing CXXC 
domain 1 


1390 


100 


62 


I D DODU 


Homo 
sapiens 


Membrane-bound protein
PRO783.


2492 


99 


63 


XDOOOU 


Homo 
sapiens 


Membrane-bound protein
PRO783.


1709 


99 


64 


S70011 


Rattus sp.


tricarboxylate carrier 


895 


55 


65 


AF139518


Rattus 
norvegicus 


A-kinase anchor protein 


178 


24 


66




Homo sapiens 


Homo sapiens DH1308_1 clone 
secreted protein. 


157 


30 


67


*%AJ 4l *x /JO 


Homo sapiens


ciauain-io 


1206 


100 


68 


AF099138


Rattus 


GLUT4 vesicle protein 


4183 


87 


69 


AF099138 


Rattus 


GLUT4 vesicle protein 


4906 


86 


70 


Z82059 


Caenorhabditis
elegans


Similarity to Drosophila ring 

nana "I nrnHAln ^nmoo f 1 rr\m 

this gene 


1285 


44 


71 


AF224278


Homo sapiens


rnarnj. protein 


1282 


100 


72 


AF126426 




UCUiULL llUUi 


1809 


100 


73 


Y41652 


Homo 
sapiens 


Human MEK2 protein sequence. 


2065 


99 


74 


Y41652 


Homo 
sapiens 


Human MEK2 protein sequence . 


1207 


100 


75 


AF188622 


Mus musculus 


selectively expressed in 
embryonic epithelia protein-1


1485 


74 




AE000406 


Escherichia 
coli 


putative DNA topoisomerase


950 


100 


77 


X99302 


Homo sapiens 


Pop1


655 


100 


78 


AL136538 




Schizosaccharomyces pombe


similarity to S. cerevisiae 
kti12 protein


210 


31 


79 


AF129756


Homo sapiens


G4 


1554 


99


SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY


80


AL096768


Homo sapiens


dJ858B16.2

(phosphatidyl serine
decarboxylase (PSSC, EC
4.1.1.65))


2033 


100


81 


AL096768 


Homo sapiens


dJ858B16.2

(phosphatidyl serine
decarboxylase (PSSC, EC
4.1.1.65))


1220 


96 


82 


X57351


Homo sapiens 


1-8D 


677 


98 


83 


AC005594 


Homo sapiens 


R26984_1


2700 


98 


84 


X73113


Homo sapiens 


fast MyBP-C 


5959 


99 


85 
86 


AF097330 
AB018423


Homo sapiens
Mus musculus


H1 chloride channel; p64H1;
CLIC4

SH2 domain-containing protein


1305 


99 


87 


AF272151 


Homo sapiens 


adaptor protein CIKS 


1360 
3084 


78
99


88
89


AF196329 


Homo 
sapiens 


triggering receptor expressed 
on monocytes 1


1214 


100


90


AB016879 


Arabidopsis 
thaliana 


contains similarity to pre-mRNA
splicing
factor~gene_id:MRB17.2


634 


36 


91 


AJ133721 
AJ242864 


Mus musculus 
Mus musculus 


homeodomain protein 
phtf protein 


~^54 

619 


' "til 
61 


92 
93 


A61971 

VQQlCC 

z y y j od 


unidentified 
Homo sapiens 


MCSP

Human PRO1250 (UNQ633) amino
acid sequence SEQ ID NO: 86.


11676 
3890 


99 
100 


94 


Y87231 


Homo sapiens 


Human signal peptide
containing protein HSPP-8
SEQ ID NO: 8.


1031 


100 


95
96


AF227741 


Rattus 
norvegicus 


protein kinase WNK1


2428 




9i~ 


AF227741 


Rattus 
norvegicus 
Homo sapiens 


protein kinase WNK1
Human OXRE-io. 


1961


94


98 
99 


AL021366 
AC005783 


Homo sapiens 
Homo sapiens 


CICK0721Q.3 (Kinesin related
protein)

R33083_1


1626 
3423 


100 
100 


100 


Y95293


Homo sapiens


Human GEF containing NEK-like
kinase substrate sGNX.


1575 

4092 


~5<S ' 

99 


101 
102 


AL118501
AC006257


Homo sapiens 
Homo sapiens 


dJH9iM16.1 (novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))
ClpX-like protein


1509 


100 


103 
104 


AF100753 
AB015982 


Homo sapiens 
Homo sapiens 


ancient ubiquitous 46 kDa
protein AUP1

serine/threonine kinase


3233 
2042

4718 


100

96 
100 


105 
106 

107


AF151074 
M35522 


Homo sapiens

Canis familiaris


HSPC240

GTP-binding protein (rab7)


831 
354 


64 
50 




R99800 


Homo sapiens


NTll-i nerve protein, 
facilitates regeneration of 
nerve cells. 


2337 


93 


108 

109 
110 


AF125533 

Ac!du5*i4 
AP064729 


Homo sapiens 

Homo sapiens
Homo sapiens


NADH-cytochrome b5 reductase
isoform

F23269_2

RAN binding protein 16 


1290 
3369 


93 
99 


111 
112 

113


X52425
Y41686


Homo sapiens

Homo sapiens


interleukin 4 receptor
Human PRO274 protein
sequence.


3285 
4496 
2285 


100 
100 
100 




W15506 


Homo sapiens 


Mitogen activating protein 
kinase ERK1.


1991 


100


114

115


Y71071 


Homo sapiens 


Human membrane transport 
protein, MTRP-16. 


1190 


99 


116

117


SE043548 

AF189817 
W30891 


Homo sapiens 

Mus musculus 
Homo 


dJ398G3.1 (ortholog of rat
CPG2)

jvectin-2

Human cytostatin Lit protein.


3497 
1124 

m 


99 
90 

99






sapiens 








118 


AF116618 


Homo sapiens 


PRO1038 


1469 


100 


119 


Y08915 


Homo sapiens 


alpha 4 protein


1748


100


120


AF098070


Drosophila 
melanogaster 


Lis1 homolog


192 


39 


121 


AF052432 


Homo sapiens 


katanin p80 subunit 


181 


37 


122 


Y70743 


Homo sapiens 


PSEQ-1 protein encoded by 
NSEQ gene associated with 
matrix remodelling.


2637 




123 


AF083246 


Homo sapiens 


HSPC028 


2132 


100 


124 


Y27096 


Homo sapiens 


Human viral receptor protein 
(ACVRP) . 


833 


99 


125 


M63105 


Leishmania 
major 


glycoprotein 36-92 


172 


27 


126 


U75467 


Drosophila 
melanogaster 


Atu


935 


Jo 


127 


Z68220


Caenorhabditis elegans


Similarity to Human ADP/ATP 
carrier protein 


438 


43 


128 


AF095927 


Rattus 
norvegicus 


protein phosphatase 2C 


1927 


94 


129 


W92958


Homo sapiens 


Human zsia4 4 Qrotein 


H O J 


100 


130 


AF115391 


Lactobacillus sakei


ribokinase RbsK


Jus) 


37 


131 


X93498 


Homo sapiens 


21-Glutamic Acid-Rich Protein 


1250 


100 


132 


X93498 




21-Glutamic Acid-Rich Protein


916 


87 


133 


W52811


Homo sapiens 


Human DBI/ACBP-like protein
(DBIH).


705 


97 


134 


Y84444


Homo sapiens 


Amino acid sequence of a 
human RNA-associated
protein.


3230 


100 


135 


M69181


Homo sapiens


non-muscle myosin B


189 


20 


136 


W74882 




Human secreted protein 
encwea oy gene xs>* Clone 
HE6FL83 . 


480 


100 


137 


W78200 


Homo sapiens 


Human secreted protein 

encoded hv a&r\t* m. r* 1 /Nr>«* 
cuviuucu kjy ijciiG /3 Clone 

HHGAUB1. 


855 


99 


138


AL033520 


Homo sapiens 


dJ349A12.1 (similar to
KIAA0701 protein)




39 


139 


AF020261 


Santalum 
album 


proline rich protein 


119 


30


140 


X70394 


Homo sapiens 


zinc finger protein 


1634 


100


141 


Y06439 


Homo sapiens 


Human protease HUPM-8. 


QIC ~ " " 


100 


142 


Z68493 


Caenorhabditis elegans


predicted using Genefinder 


365 


42 


143 


AB018107 


Arabidopsis 
thaliana 


ADP-ribosylation factor-like
protein 


596 


65 


144 


AF161483 


Homo sapiens 


HSPC134 


580 


51 


145 


Y84902 


Homo sapiens 


A human proliferation and
apoptosis related protein. 


480 


100 


146 


AB004906 


Ipomoea 
purpurea 


transposase


146


20 


147 


AC007357 


Arabidopsis 
thaliana 


F3P19.18 


647 


31 


148


W75155 


Homo sapiens 


Human secreted protein 
encoded by gene 41 clone
HNTME13.


1494 


98 


149 


AF05649D 


Homo sapiens 


cAMP-specific
phosphodiesterase 8A 


3710 


99 


150 


Y58171 


Homo 
sapiens 


Human hydrolase homologue 
HHH-7. 


785 


99 


151 


U10397 


Saccharomyces cerevisiae


Yhrl4Svp 


S15 


53


152 


X73478 


Homo sapiens 


phosphotyrosyl phosphatase
activator 


1719 


99 


153




dJ382I10.5.1 (novel protein
similar to arginyl-tRNA)


2034


99



WO 01/53312



PCT/US00/34263



TABLE 2






154


AF169802 


Homo sapiens 


cytochrome b5 reductase b5R.2 


1455 


99 


155 


X94703 


Homo sapiens 


rab28






156 


Y25716 


Homo sapiens 


Human secreted protein
encoded from gene 6.


1471


100 


158 


W77404 


Homo sapiens 


Secreted salivary polypeptide 


937 


100 


159 


Y17248 


Homo sapiens 


Human protein kinase
inhibitor-2 (PKI-2).


JDJ 


100 


160 


J04970 


Homo sapiens


carboxypeptidase M precursor


2395 


100 


161 


W54040 


Homo sapiens 


Human interferon-inducible


484 


98 


162 


AL022724 


Homo sapiens 


dJ413H6.1.1 (hamster
Androgen-dependent Expressed
Protein LIKE putative
protein) (isoform 1)


1357 


100 


163 


AF125535 


Homo sapiens 


pp21 homolog


193


45 


164 




Homo sapiens


Human secreted protein, SEQ 
iu r>v/ . / / ij . 


463 


97 


165


AJ250839 


Homo sapiens 


serine/threonine protein
kinase 


1442 


71 


166 


L09649


Zymomonas
mobilis


zm2


173


37 


167 


Y73337 


Homo sapiens 


HTRM clone 1944530 protein 
sequence. 


1204 


100 


168 


notions 


Homo sapiens 


Secreted protein encoded by 
gene 112 clone HUKFC71.


1084 


100


169






Homo sapiens 


ATP-dependent RNA helicase


4402 


100 


170 




Methanobacterium
thermoautotrophicum


conserved protein 


166 


27 


171 


Y27684 


Homo sapiens


Human secreted protein
encoded by gene No. 11 S.


621 


100 


172 


AP226044 


Homo sapiens 




2904 


100 


173 


AJ245946 


Homo sapiens 


neuroglobin 


779 


100 


174 


D43949 




111 it* gene is novel • 


3202 


100 


175 


Y07923 




GTP-binding protein


1205 


100


176


W90338


Homo sapiens


numan urx nomoiogus procein. 


966 


100 


177 


Y41675


Homo sapiens


Human channel-related
molecule HCRM-3.


1 1 ">0 ' 


100


178 


Y41674


Homo sapiens


Human channel-related
molecule HCRM-2.


0 


0 0 


179 


AF220492 


Homo sapiens 


Krueppel-like zinc finger 
protein HZF2 


4100 


99 


180 


X03084 


Homo sapiens 


Clq B-chain precursor 


1240 


100 


181 


U57344 


Mus musculus


Meis3 


1813 


89 


183 


U57344 


Mus musculus


Meis3 


1743 


86 


184 


U57344 


Mus musculus 


Meis3


1070 


86 


185 


AF033120 


Homo sapiens 


p53 regulated PA26-T2 nuclear 


1389 


58 


186 


AF200357 


Mus musculus 


pantothenate kinase 1 beta 


frfos 


82 


187 


W75058 


Homo sapiens


Human secreted protein

encoded by gene 2 clone 
HLDBG33. 


1100 


99 


188 


AJ292529 


Homo sapiens 


suppressor of sterile four 1 


2424 


100 


190 


X54134 


Homo sapiens 


protein-tyrosine phosphatase


3705 


100 


191 


Y22203 


Homo sapiens 


Human calcium-binding
phosphoprotein, CBPP-1,
protein sequence. 


1083 


99 


192 


W63692 


Homo 
sapiens 


Human secreted protein 12. 


1975 


100 


193


W87772 



Homo sapiens 


Human serum glucocorticoid- 
regulated kinase (H-SGK2) 
polypeptide.


2605 


99


194 


AF084259


Mus musculus


bromodomain-containing
protein BP75


693 


54 


195 


Y00752 


Rattus 
norvegicus 


serine dehydratase (AA 1-327)


994 


61 


196 


W95349 


Homo sapiens 


Human foetal brain secreted 
protein fhl70 1. 


2^96 


100 


197 


AB028859 


Homo sapiens 


hDj9 


1890 


100 


198 


W95633 




Homo sapiens secreted protein
gene clone hm236_1.


1614


100 


199 


Y44277 


Homo sapiens


2. 




99 


200 


AB030039 


Homo sapiens 


hPACPLl 


2258 


100 


201 


X54162 




64 Kd autoantigen


2918 


99 


202 


G0206I 


Homo sapiens 


Human secreted protein, SEQ 

ID NO« £142 


558 


99 


203 


X13885


Nicotiana
tabacum


extensin (AA 1-620)


185 


33 


204 


J04204 




i , — 

32 ku accessory protein 


1837 


100 


205 


J04204 


duo i.auiiUa 


34, kq accRBsory protein 


1101 


100 


567 ' 


V87283 


Homo sapiens 


Human signal peptide 
containing protein HSPP-60
SEQ ID NO: 60. 


iaid 


100 


206 


Y02860 




Fragment of human secreted
jjiULc-u encoaea oy gene ot> , 


936 


98 


209 


AL121889 


Homo sapiens 


dJ107^Ei7.1 (KIAA0823 protein 
iconcinues in Ab0^3o03M 


694


54 


210 


AF226732 


Homo sapiens




1345 


76 


211 


X66295 


Mus musculus


C1q C chain


970 


73 


212 


Z29328 


Homo sapiens 


Ubiquitin-conjugating enzyme
UbcH2 


966 


100 


213


Z29328


Homo sapiens


Ubiquitin-conjugating enzyme
UbcH2 


542 


98 


214 


AJ002030


Homo sapiens


progesterone binding protein


1163 


100 


215 


X70649 


Homo sapiens 


member of DEAD box protein
family


3933 


100 


216


A£250£S8 


Homo sapiens 


claudin-2 


1169 


99 


217 


AL021453 


Homo sapiens 


dJ821D11.1 (putative protein)


259 


100 


218 


vnoccc 


Homo sapiens 


UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransferase


3331 


99 


219 


"¥9T45~2" 


Homo sapiens 


Human inflammation associated
protein


2067 


100 


220 


AL035521


Arabidopsis
thaliana


putative protein


315 


42 


221 


AL031786 


Schizosaccharomyces pombe


synthetase 


Oil 


41 


222 


AL109736 


Schizosaccharomyces pombe


WD repeat protein


626


40


223


X52493 


Glycine max 


DNA-directed RNA polymerase 


136 


23 


224 


AL035SS9 


Homo sapiens 


dJ979Nl.l' , {dJ979Nl.iJ "" ." 


Sl$9 " 


98 


225 


AB032401 


Mus musculus 


mmDj4


1761 




226 


AB032401 


Mus musculus 


mmDj4


1988 


92 


227 


X83502 


Saccharomyces cerevisiae


J1007


112 


2d 


228 


X83502 


Saccharomyces cerevisiae


J1007 


79 


25


229 


AF143723 


Homo sapiens 


heat shock protein HSP60 


2557 


99 


230 


Y66677 


Homo 
sapiens 


Membrane-bound protein
PRO828.


982 


100 


231


AB0274?? 


Homo sapiens 


spondin 2 


1756


99


232 


W95634 


Homo 
sapiens 


Homo sapiens secreted 
protein. 


1391 


100 


233 


K00365 


Homo sapiens 


Human cyclin B1.


2218 


99 


234


Y53762 


Homo sapiens 


A GTP-binding polypeptide
designated RAQ.


1017


100


235 
236 


Z50749 
Z50749 


Homo sapiens 
Homo sapiens 


yeast sds22 homolog
yeast sds22 homolog


1800 
1754 


100 
98 


237 


AB026491 


Homo sapiens 


PICKl 




100 


238


AJ270205


Entodinium
caudatum


putative
phosphatidylinositol-4-
phosphate 5-kinase


XXn 


37 


239


AB030189 


Mus musculus 


contains transmembrane (TM) 
region and ATP binding region 


710


93


240 


W56538


Homo sapiens


protein (HIP).


Tior " " " 
j loo 


99 


241 


W56538 


Homo sapiens 


protein (HIP) . 


3436 


99 


242 


AF155107 


Homo sapiens 


NY-REN-37 antigen


996 


99 


243 


AF155107 


Homo sapiens 


NY-REN-37 antigen


1005 


100 


244 


AL031320 


Homo sapiens 


dJ20N2.1 (novel protein
similar to yeast and
bacterial cytosine
deaminase)


763 


99 


245 


U37026 


Rattus 
norvegicus 


sodium channel beta 2 subunit


162 


30 


246 


AL078599 


Homo sapiens 


uu?jik,o,i movci protein 
similar to C. elegans 

J * » ^ \ X z. . try XV oK) ) ) 


2391 


98 


247 


U32274 


Saccharomyces cerevisiae


Ydr386wp; CAI: 0.17


191 


37 


248 
249 


Y41719 
AB029434 


Homo 
sapiens 
Homo sapiens 


Human PRO564 protein
sequence.
ghrelin precursor 


1079 


100


250 


X97831 


Rattus 
norvegicus 


s>ai.uxLine/ qwyxcoiniuine 


611 
246 


100 
38 


251 


W80993 


Homo 
sapiens 


Human RIP- interacting factor 
RIF. 


1724 


100 


252 


Y94873 


Homo sapiens


Human protein clone HP02632.


1876 


100 


253 


W59878 




™'* l ' u — accjuence or tne 
cDNA clone AIF-2 (HEBGM49) . 


765 


100


254


AL354533 


Leishmania 
major 


possible adenylate kinase 


265 


34 


255 


AF233322 


Mus musculus 


zinc transporter like 2 


1914 


95


256


Y78113 


Homo sapiens 


Human cytokine signal " 

NO:l. 


2247 


99


257


AL035539


Arabidopsis 
thaliana 


putative amino acid transport 
protein 


390 


27 


258
259


W74787


Homo sapiens 


Human secreted protein 

encoded hc\f rrono Co r»1 rma 

HHFHN61 . 


1171 


100




"AL03S^89 


Homo sapiens 


dJ187J11.1 (novel protein
similar to protein kinase C
inhibitors)


an a 
y ft 


i nA 

100 


260 
261


AE000909 
AL050131 


Methanobacterium
thermoautotrophicum
Homo sapiens


serine/threonine protein
kinase related protein 

hypothetical protein 


—- 

626 


5U 

100


262
263


AF019661 
AL035593 


Mus musculus 
Homo sapiens


zeta proteasome chain; PSMA5
cuteiOJS.i (novel protein)


1214
821 


100 
100 


264


AL022318


Homo sapiens


DK1S0C2.3 (PUTATIVE novel
protein similar to APOBEC1)


1072 


100 


265 


AP205940 


Homo sapiens 


endomucin 


1289 


100 


266 


AL023583 


Homo sapiens 


aa5Q0L14.i (novel protein) 


789 


100


267


AL034548 


Homo sapiens 


d^JH03G7.3 (novel protein 
kinase domains containing 
protein similar to 
phosphoprotein CBFW) 


1888 


99


268


AF161470


Homo sapiens


HSPC121


1884


98


269 
270 

271 


AF161470 
X90763 

AF207600


Homo sapiens
Homo sapiens
Homo sapiens


HSPC121

HHa5 hair keratin type I
intermediate filament
ethanolamine kinase


1232
2190


99
100


272
273 


M32334 
AF161483 


Homo sapiens
Homo sapiens


intercellular adhesion
molecule 2

HSPC134 


19S2 
143"6 


1 J.UO 

"1 6-1 


274 


Y530S2 


Homo sapiens 


Human secreted protein clone 
df202_3 protein sequence SEQ 
ID NO: 110.


663 

~ con 

SO / 


j 100 


276 


Y77576 


Homo sapiens 


Human cytoskeletal protein 
(HCYT) (clone 2195418) . 


/ 


100


277 


AF077042 


Homo sapiens 


30S ribosomal protein S7
homolog


~ toco

X4D3


100


278 


Y94467


Homo sapiens 


Human secreted protein clone 
cai06^19x protein sequence 
SEQ ID NO: 20. 


1619 


98 


279 
280 


Y68788 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-20. 


2801 


99




Z75134


Canis
familiaris


rod transducin


1816 


100 


281 

282 
283 
284 


AF249873 
AL050007


Canis
familiaris
Homo sapiens
Homo sapiens


rod transducin
hypothetical protein


1718

1395 
405 


9* 

100
98


285
286

287


AF201931 
AF156102 
Y35897 

U88964 


Homo sapiens 
Homo sapiens
Homo sapiens

Homo sapiens


DC1

ELL complex EAP30 subunit
Extended human secreted
protein sequence, SEQ ID NO.
146.

HEM45


1859 
1318 


99
99 

1 K q 

99


288
289


AL050143


Homo sapiens 
Homo sapiens 


hypothetical protein 
telethonin 


923 
598 
574


100
100

100
100


290
291


Y66724

AF034801
AF034801


Homo sapiens


Membrane-bound protein
PRO836.


2321


292


Homo sapiens
Homo sapiens


liprin-alpha4
liprin-alpha4


2565
2590


98

100


294


Y73348


Homo sapiens


dJ889J228.1 (novel protein
(isoform 1))


1738 


100 


295 


L11672


Homo sapiens
Homo sapiens


HTRM clone 839651 protein
sequence.

zinc finger protein 




99 


296
297 


AL035423 


Homo sapiens 


dj20i3.1 (brain mitochondrial 
carrier protein-1 (BMCP1))


1694
1024 


79


298


AF198532
AF161417


Homo sapiens


lymphoid enhancer binding
factor-1

HSPC299

breast cancer metastasis-suppressor 1


2173 


100 


299 


AF159142


Homo sapiens


1147 1 
"123? | 


"a ft 

BO 


300 
301 


U26397 


Rattus 
norvegicus 


Inositol polyphosphate 4- 
phosphatase 


160 


30


302


AF036145
Z82022


Homo sapiens


meningioma-expressed antigen
5

GlcNAc-1-P transferase


3458 


100 


303 


AP2*9232 


Mus musculus


butyrophilin-like protein 
BUTR-1 


2067
27 i 


$9 
50 


304 

305


AJ222644


Arabidopsis
thaliana


asparaginyl-tRNA synthetase


659 


50 


306 


AF054180
AJ272079


Homo sapiens
Homo sapiens


hematopoietic cell derived
zinc finger protein
APOBEC-1 stimulating protein 


351 


79 


308 

309


£44486 

U131891 1 


Homo sapiens
Homo sapiens


Human CPRW receptor
polypeptide.

DNA polymerase mu 


3056
1721 

2598 


100 
100 

100


310 


AF293335 


Homo sapiens 


p30 DBC 


1248 


92 


311 


AF176525 


Mus musculus 


F-box protein FBL12


1501 


93 


312 


X57802 


Homo sapiens 


immunoglobulin lambda light 
chain 


959


81 


313 


Z36715


Homo sapiens 


Net 


2048 


98 


314 


AF161532 


Homo sapiens 


HSPC047 


727 


100 


315 


AF208068


Homo sapiens 


kelch-like protein KLHL3a 


3046 


100 


316 


Ybbbbb 


Homo 
sapiens 


Membrane-bound protein
PRO1013.


1166 


100 


317 


Y29666 


Homo sapiens 


Human Ras protein RA^R-i . 


1253 


98 


318 


AJ387747 


Homo sapiens 


sialin 


2614 


99 


319 


AF161362 


Homo sapiens 


HSPC099 


224 


40 


320 


Y68773 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-5. 


2243 


99 


321 


AJ238379 


Homo sapiens 


putative TH1 protein


3013 


100 


322


AB040812


Homo sapiens 


protein kinase PAK5 


3792 


99 


323 


Y95013


Homo sapiens 


Human secreted protein 
vc48_l, SEQ ID NO: $6. 


913 


100 


324 


Y13381 


Homo sapiens 


Amino acid sequence of 
protein PRO271.


19^6 


100 


325 


Y94944 


Homo sapiens 


Human secreted protein clone 
i>fl57^16 protein sequence 
SEQ ID NO: 94. 


2305 


98 


326 


Y76884 


Homo sapiens 


Retinoblastoma binding 
protein-7 sequence.


6728 


99 


327 


AF198532


Homo sapiens 


lymphoid enhancer binding 
factor-1 


2173 


100 


328 


Z78013 


Caenorhabditis elegans


Similarity to Drosophila
Cadherin-related tumor
suppressor 


569 


33 


329 


AF212921 


Mus musculus 


MMTV receptor variant 1 


' 484 1 


94 


330 


Z75330 


Homo 

sapiens] 

>R65207 

R65207 02- 

MAR-1995 27- 

AUG-1993 

Human 

stromalin-i. 
[Homo 
sapiens 


nuclear protein SA-1 


6492 


99 


331 


AL008583 


Homo sapiens 


dJ327Ji6.3 (supported by 
GENSCAN, FGENES and GENEWISE) 


2133 


99 


332 


Y36104 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
489. 


310 


41 


333 


AJ271669


Homo sapiens 


putative sialoglycoprotease 


1747 


100 


334 


AF156598 


Mus musculus 


p53-regulated DDA3


997 


64 


335 


M99058 


Eimeria 
maxima 


em100 gene is homologous to the
Eimeria tenella gene et100


154 


26 


336 


Y85564 


Homo sapiens 


Human homologue of UNC-53
(Hs-UNC-53/1) sequence.


3386 


97 


337 


Y85564


Homo sapiens 


Human homologue of UNC-53
(Hs-UNC-53/1) sequence.


2602 


94 


338 


Y85564 


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence. 


3447 


98 


339 


Z66561


Caenorhabditis elegans


Similarity to Human rab13
protein (PIR Acc. No.
A49647).


716 


34 


340 


AB021643 


Homo 
sapiens 


gonadotropin inducible 
transcription repressor-3 


2761 


99 


341 


G01946 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6027. 


465 


98 


342 


AF020591 


Homo sapiens 


zinc finger protein 


1091 


48 


343 


L29154 




immunoglobulin heavy chain
VDJ region


439


84






344 


U10281 


Sus scrofa


gastric mucin 


279 


24 


345 


AK000404




unnamed protein product 


1177 


99 


346 


L22557


Rattus
norvegicus


calmodulin-binding protein


1949 


84 


347


L22557


Rattus
norvegicus


calmodulin-binding protein


2363 


91 


348 


AL049481


Arabidopsis
thaliana


AIG1-like protein


316 


30 


350 


AJ251516 


Mus musculus 


cysteine and histidine-rich
protein


1460


99 


351 


AK024477 


Homo sapiens


riiuuuu /u protein 


1773 


100 


352 


U50133 


Homo sapiens 


ankyrin


502


33


353 




Homo sapiens 


unnamed protein product 


721 


100 


354 


Zk.Pi ct aon 


Homo sapiens 


HSPC302 


2623 


97 


355


AJ010014 


Homo sapiens 


M96A protein 


1269 


47 


356


AF151029 


Homo sapiens 


HSPC195 


941 


91 


357


AL022327 


Homo sapiens 


dJ355C18.1 (KIAA0027)


1911 


100 


358 


W78128


Homo sapiens 


Human secreted protein 
encoded by gene 3 clone 

HOSBI96. 


1117 


100 




X03414 


Drosophila
melanogaster


Kr polypeptide 


316 


45 


360 


AF151079


Homo sapiens 


HSPC245 


643 


100 


361 


Y53886


Homo sapiens


A suppressor of cytokine
signalling protein
designated HSCOP-6. 


S30 


41 


362 


AF254741


Drosophila
melanogaster


Centaurin Gamma 1A 


661 


46 


363 


AF213465


Homo sapiens 


dual oxidase 


2016 


100 


364 


AF181562 


Homo sapiens 


proSAAS 


1314 


100 


365


AF181562 


Homo sapiens 


proSAAS


1024 


99 


366


U73200 


Mus musculus 


p116Rip


884 


82 


367 


AF263744 


Homo sapiens 


erbb2-interacting protein
ERBIN 


4973 


93 


368


U37501 


Mus musculus 


laminin alpha 5 chain 


"S867 


72 




AF043695


Caenorhabditis elegans


similar to the protein
phosphatase 2C family


549


36 


370 


X /.3440 


Homo sapiens 


Human secreted protein clone 
yj23_l protein sequence SEQ 
ID NO: 102.


1484 


99 


371


AF272833 


Homo sapiens 


misato 


2869 


97 


372 




Homo sapiens 


epithelial protein lost in 
neoplasm beta 


3927 


100 


373 




— - j 

Homo sapiens 


HTRM clone 438283 protein 
sequence.


273 


80 


374 


AF169017 


Homo sapiens 


formiminotransferase
cyclodeaminase 


2717 


98 


375 


A95io<; 


unidentified 


RED ALPHA 


1202 


99 


376 


W74fl5 R 


Homo sapiens 


Human secreted protein 
encoded by gene 100 clone 

HLQA552 . 


1012 


99 


377 


Y32131 


Homo sapiens 


Human LYST-2 protein. 


3556


99


378


M14912 


Homo sapiens 


pol 


132 


86 


379




Homo sapiens 


PRO0518 


382 


100 


380 


X66363 


Homo sapiens 


serine/threonine protein 
kinase 


2499 


100 


381 


Y41699 


Homo 
sapiens 


Human PRO703 protein
sequence.


2362 


100 


382 


AF174498


Homo sapiens 


GR AF-1 specific protein 
phosphatase 


7008 


98 


383 


U64608 


Caenorhabditis elegans


coded for by C. elegans cDNA
yk173c12.5


244 


U ' 


384 


U50133 


Homo sapiens 


ankyrin 


502 


33 


385


AJ238b26 


Homo sapiens 


putative transcription 
factor-like nuclear regulator


4123 


97


JO/ 


ArzUoo45 


Homo sapiens 


BM-003 


1375 


99 




A3 / OiSJ, 


Homo sapiens 


immunoglobulin lambda light 
chain 


797 


76 


390 


AF182404 

- \f6£ddA 


Homo sapiens 


mitochondrial uncoupling 
protein 1 


1670 


99 


391 




Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence. 


3386 


97 


393 


AF178432


Homo sapiens 


SH3 protein 


3700 


100 




At £ 2yy2Q 


Drosophila 
melanogaster 


cytoplasmic protein 89BC 


1616 


62 


39* 


AF181721 


Homo sapiens 


RU2S 


2254 


100 




Y69197 


Homo sapiens 


Amino acid sequence of a 
human beta IV-spectrin
protein. 


1626 


98 


397 


U48238 


Mus musculus 


zinc finger protein neuro-d4 


749 


60 


398


AL390137 


Homo sapiens 


hypothetical protein 


263 


51 


399 


AF217525 


Homo sapiens 


Down syndrome cell adhesion 
molecule 


5337 


60 


400 


AL022599 


Schizosaccharomyces pombe


WD repeat protein 


447 


27 


401 


AC004859


Homo sapiens


similar to 2-oxoglutarate
dehydrogenase similar to
Q02218 (PID:g1352618)


4176 


78 


402 


AB010266 


Mus musculus 


tenascin-X 


10246


62


403 


AL133288 


Homo sapiens 


dJ671D7.1 (similar to
D. melanogaster CG5986
protein) 


761 


100 


404 


Z68753 


Caenorhabditis elegans


ZC518.3D 


888 


48 


405 


Z78013 


Caenorhabditis elegans


Similarity to Drosophila
Cadherin-related tumor
suppressor


569


33


406 


AB031230 


Homo sapiens 


protein containing CXXC 
domain 2 


1196 


97 


407 


AF155106 


Homo sapiens 


NY-REN-36 antigen


1168


100


408


Y57945 


Homo sapiens 


Human transmembrane protein 
HTMPN-69. 


1538


99


409


Z18361 


Ovis aries 


trichohyalin 


184 


30 


410 


AF249744 


Homo sapiens 


RhoGEF


2733 


100 


411 


AF176529 


Mus musculus 


F-box protein FBX13 


2072 


94 


412 


AF210842 


Homo sapiens 


HARP 


4880 


100 


413 




Homo sapiens 


dJ310O13.7 (novel protein 
similar to H. roretzi HRPET-
3)


776 


98 




X57398 


Homo sapiens 


pmS protein 


6131


99


415 


AB029826 


Homo sapiens 


3-methylcrotonyl-CoA
carboxylase biotin-containing
subunit 


2961 




416




Saccharomyce 
s cerevisiae 


Lphlp 


115 


42 


417 




Leishmania 
major


possible t26fl7.21 


239 


35 




Ynfti no 


Homo sapiens 


Human PRO331 protein.


330


29


419 


U15131


Homo sapiens


p126


2228 


54 


420 


AF117946 


Homo sapiens 


Link guanine nucleotide


2363 


100 


421 


AF190635 


Drosophila
melanogaster


ankyrin 2


755 


30 


422 


AF302150 


Homo 
sapiens 


phosphoinositol 3-phosphate-
binding protein-2


1962 


100 


423 


AL137530 


Homo sapiens 


hypothetical protein 


433


54


424 


XS3753 


Homo sapiens 


son-a


7269 


100 


425 


AB027249 


Homo sapiens 


MAPKK like protein kinase 


1693


100 


426 


AF279144 


Homo sapiens 


tumor endothelial marker 7
precursor


1084 


55 



AF279144



Homo sapiens



tumor endothelial marker 7
precursor



1259



5T



429 



430 



431 



432 
433 



434 



437
438



YQ7829 



Drosophila 
melanogaster 



CG8312 gene product 



AF096897 



Homo sapiens 



CJ41387 



Drosophila 
melanogaster 



RING finger protein



pushover 



AF023674 
AFT4TW 



Homo sapiens 



Homo sapiens 



<3u protein 



AB006697



Homo
sapiens
Arabidopsis



nephrocystin



septin 2-like cell division
control protein



thaliana 



Y94247 
AB04D672™ 



Homo sapiens 



cleft lip and palate 
associated transmembrane 
protein-like



Human calcium binding protein 
hCBP.



Homo sapiens 



UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransferase



2201 



4442 



4021 



3783 



2284 



886 



1704



1075 



2T 



99 



47 



99 



100 



100 



42 



100 



63 



440 



Bos taurus 



R06463 



Homo sapiens



tuftelin 



Derived protein of clone 
ICA13 (ATCC 40553) . 



285



3073 



99



442



443 



X53773 



Y66689 



Mus musculus 

Rattus

norvegicus



alpha-adaptin (A) (AA 1-977)



alpha-c large chain (AA 1-



Homo 
sapiens 



938) 



4897 
3979



Membrane-bound protein
PRO1136.



3299 



Polypeptide fragment encoded 
by gene 156. 
HSPC327 



98 



81 



99




449
450



AF161445
Z68753



Homo sapiens
Caenorhabdit



1606 



100



451



452 



W39160 



is elegans 
Homo sapiens 



ZC518.3b



951 



W8S727 



Homo 

sapiens 

Homo sapiens 



Human partial complement 
factor H protein fragment 3. 
Novel protein {Clone 
BM46 10) , 



155 



2799 



49 
32



99 



A bone marrow secreted 
protein designated BMS115. 



2810 



100 



456
457



AF240468 

zisoo* — 



Homo 
sapiens 



Homo sapiens 
Homo sapiens" 



Similar to a C. elegans 
protein in cosmid C14H10



4069 



nicastrin 



CENP-E



3687" 



13305 



100 



100 



99 



Homo 
sapiens 
Homo sapiens 



gamma-aminobutyric acid
receptor beta-1 subunit 



2477 



100 



460 



461 



W67824" 



Homo sapiens 



Human secreted protein clone 
yd6l_i protein sequence SEQ 
ID NO:156. 



96T* 



AF163151 



Homo sapiens 



D87446 



Homo sapiens 



Human secreted protein 
encoded by gene 18 clone 
HSLFM29 . 



535 



dentin sialophosphoprotein " 279 
precursor j 



Similar to a C. elegans 
protein encoded in cosmid
C27F2 (U40419) 



9196 



100 



100 



19 



99 



"46T 
-4§5" 



Homo sapiens 



AC002398 



Human secreted protein, SEQ 

ID NO: B125. 



486" 



AF0648^6~ 
AF223408" 



Homo sapiens 



F2596"5 1 



Rattus sp.



7acomp protein 
B99 



1018 



1845 



9T 



100 



84 



sapiens 



39" 



TABLE 2



PCT/US00/34263



SEQ 
ID 
NO: 
466 


ACCESSION 
NUMBER 

AF223408


SPECIES 


DESCRIPTION 

B99 


SMITH-
WATERMAN
SCORE
2878


%

IDENTITY 
87 


467 
468

469 


AF104415 
U53450 

AL031297 


Mus musculus
Rattus 
norvegicus 
Homo sapiens 


gene trap locus -13 * 
Jun dimerization protein 1 
JDP-1

dJ97P20.1 (novel gene)


6336
196 


91 

- -Tq — 


470 


AF257077




eukaryotic translation 
initiation factor EIF2B 
subunit 3 


3564 
1274 


99 
" 95 


471 


L28125 


anserina


beta transducin-like protein 


284 


"~38 


472 
473 


Y84903 
AF144237 


Homo sapiens 
Homo sapiens 


A human proliferation and 
apoptosis related protein. 


2337 
1 252 


"100 


474 


Y71213 


Homo sapiens 


jjUMi 1 protein 

Human irritable bowel disease 
related polypeptide XMX39. 


T838 


44 
100 


475 
476 


Y95006 
D38549 


Homo sapiens 
Homo sapiens 


Human secreted protein 
vel3_l, SEQ ID NO: 52. 
hal025 is new 


3411 
6533


100 
99 


477 
478 


AF241230
AL031534 


Homo sapiens 
Schizosaccharomyces
pombe


TAK1-binding protein 2
putative asparagine synthase


3656
482


160 
40 


479
480 


L28125 
AF161544 


Podospora 
anserina 
Homo sapiens 


beta transducin-like protein 
HSPC059 


1 233 
[ 434 


— 

77 


481 
482 

483 


nx* & j a A H o 

Z38061 
AF161381 


Saccharomyces
cerevisiae

Homo sapiens 


centaurin beta2 

mal5, sta1, len: 1367, CAI:

0.3, AMYH_YEAST P08640

GLUCOAMYLASE S1 (EC 3.2.1.3)

HSPC263 


3986
295

1404 


99 
"23 

J.UU 


484 
486 
487 


AF223468 
X57527
Y19062 


Homo sapiens 
Homo sapiens 
Homo sapiens 


AD021 protein 

alpha 1(VIII) collagen

39k3 protein 


1314 
4166 


100 
99 


488 


Y73373 


Homo sapiens 


HPRM clone 921803 protein 

sequence . 


2475 
55* 


100 
56 


489 


AL021918 


Homo 
sapiens 


Jbi4lU.i (Kruppel related Zinc
Finger protein 184)


4184 


100


490 
491 


X53773
U52426 


Rattus 
norvegicus 
Homo sapiens 


alpha-c large chain (AA 1-
938)

GOK 1 


4675 


97 


492 
493 


AL359773 
AF22^14 ■ 


Leishmania
major

Homo sapiens 


possible threonine synthase 
ferroportin1


1459 
702 


Ss 
45 


494 
495 


AF036977 


Homo sapiens 
Homo sapiens 


aa222El3.1 (novel protein
with some similarity to
Drosophila Kraken)
unknown


2929 
513 

1812 


100 
96 

100 


496 
497 


U93564 

Y91405 


Homo sapiens 


p40 

Human secreted protein
sequence encoded by gene 2
SEQ ID NO: 126.


133 
357 


45 
100 


498 


AF069781 


Drosophila 
melanogaster


3em46-like protein 


653 


43 


499 


Yl*i601 




Human cell-cycle
phosphoprotein CECYP-2.


1658 


98 


500 


X70944 


Homo sapiens 


PTB-associated splicing 
factor 


3883 


100 


501
502 


AWfc 7503 
AF262874 


Mus 

musculus 
Homo sapiens 


putative membrane-associated

guanylate kinase 1

nectin 3; PRR3


205 " ■ " 
2856 


36 
99 


503 
504 
505 
507


AJ249732 
AF208661 
L09708 


Homo sapiens
Homo sapiens 
Homo sapiens 


G8 protein 

BM-019
complement component C2


6"69 

1629 

4022 


100 
100 
100 


508 


X66285
D00189 


Mus musculus

Rattus 

norvegicus 


Hci orf f* 

Na+,K+-ATPase alpha-subunit 


115 

5227


43 

99
TABLE 2



PCT/US00/34263



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


509 

"bin 
510 

511 

512 


Y94971 

AB019038
AB019038
AB019038


Homo sapiens 

Homo sapiens 
Homo sapiens 
Homo sapiens 


fal71_l protein sequence SEQ 
ID NO:148. 

beta-1,4 mannosyl transferase 
beta-1,4 mannosyl transferase 
beta-1,4 mannosyl transferase 


2176 

781 
1347 


100 
77 

100


513 

514
515 


X84908 
X52851
AF1860B4 


Homo sapiens 
Homo sapiens 
Homo 
sapiens 


"pnosphorylase kinase 
peptidylprolyl isomerase
epidermal growth factor 
repeat containing protein 


1520 
5729 
650 
3046 


99 
99 
76 
99 


516 

517 


G03602 
U04706 


Homo sapiens 
Bos taurus


Human secreted protein, SEQ 
id no* 

•KXJ HU a /OOO . 

50 kDa protein 


" 50* 


99 


518 
519 


G00653 
AF161475 




Human secreted protein, SEQ 
ID NO: 4734. 


1749 
530 


77 
100 


520 
521 


Y99366 
AF266852 


Homo sapiens 
Homo sapiens 


Human PRO1475 (UNQ746) amino
acid sequence SEQ ID NO: 88. 
PTPLA 


1368 
3394 


100 
97 


522 




Archaeoglobus
fulgidus


protein (smel) 


1295 
153 


100 
20 


523 


AF5S224 9 


Homo sapiens


variable region 


605 


57 


524 




norvegicus 


ARE1 


2950 


98


525 




Homo sapiens 


Cellular homologue of the
SV40 large T antigen. 


127* 


83 


526 


AF145^58 


Drosophila 
melanogaster 


BcDNA.GH10229


320 


33 


527




Homo sapiens 


putative Rab5-interacting 
protein 


524 


79 


528
529


D49387 


Homo 
sapiens 


NADP dependent leukotriene b4
12-hydroxydehydrogenase


l^l* 


100 




Y30819 


Homo sapiens 


Human secreted protein 
encoded from gene 9. 


328 


32 


530 


AL079335 


Homo sapiens 


d\7132F21.3 (72.1 KDa protein 
(DKFZP564A032, SBBI8B) 
similar to mouse IFN-gamma 
induce MG11 . ) 


1059 


99 


531 


Y915Q6 


Homo sapiens 


Human secreted protein 
sequence encoded by gene 56 
SEQ ID NO: 179. 


1159 


96 


532 


"X76116 


Caenorhabditis
elegans


carrier protein (c2) 


576* 


50 


533 


X76116 


Caenorhabditis
elegans


carrier protein (c2) 


506


50


534 


X12966 
Y09267


Homo sapiens 


3-oxoacyl-coA thiolase 
propeptide (424 AA) 


1972 


100 


535 

536 
537
538 


Z11773 
D84224 
D84224 


Homo sapiens 

Homo sapiens 
Homo sapiens
Homo sapiens


flavin-containing
monooxygenase 2 

methionyl tRNA synthetase
methionyl tRNA synthetase 


im 
2201 

4741 


100 

99 
99 


539 
540 
"541 

542 


D84224 
D84224 
J03244 

Y92514 


Homo sapiens 
Bos taurus
Homo sapiens 


methionyl tRNA synthetase 
methionyl tRNA synthetase 
H+ ATPase 3UcDa subunit (EC 
3.6.1.3) 
Human OXRE-11. 


3887 
2933 
4529 
848 


$9 * 

96 1 

99 
77 


543 


AF221712 


Homo 
sapiens 


Smad- and Olf-interacting
zinc finger protein 


2301 
2151 


99 
61 


544 


AE000919 


Methanobacterium
thermoautotrophicum


conserved protein 


207 


38 


545 


A06669 


synthetic 
construct 


preTGF-beta1

i. 


2070 


99 



TABLE 2



SEQ 
ID 

NO:


ACCESSION 
NUMBER 


SPECIES


DESCRIPTION 


SMITH-
WATERMAN 
SCORE 


IDENTITY 


546 
547 


Y02698 
AP112205 


Homo sapiens 
Homo sapiens


Human secreted protein 
encoded by gene 49 clone 
HTPCS60 . 


854 
2275 


98 
100 


548 
549 

550


X60271 
AC016827 


Mus musculus

Arabidopsis
thaliana


c-rel 

putative GTPase 


2264 
810 


74 
"~42 


551 


Y70400 
AB048365 


Homo
sapiens
Homo sapiens


Human cell-signalling
protein-2.


429 


" 6B 


552 
553 


Y57880 
AF1198SS 


Homo sapiens
Homo sapiens


NEDD4-like ubiquitin ligase 1
Human transmembrane protein 
HTMPN-4 . 
£R01B47 


8290 
1112 

265 


99 
95 

67 


554 
555 


Ml 723 6 
AL078468 


Homo sapiens

Arabidopsis
thaliana


MHC HLA-DQ alpha precursor 
putative protein 


" 1332 
540 


100 

" TO 


556 

557 
"558 


AC006963 
AK024487 


Homo sapiens 
Homo sapiens


similar to Kelch proteins; 
similar to BAA77027
(PID:g4650844)
FLJ00086 protein


515 
1623 


44 
""98 


559 
560 


M12140 
X56681 


Homo sapiens
Homo sapiens 


pol gene protein; Xxx 
Human secreted protein 
encoded by gene 97 clone 
HAQDF73 . 
junD protein 


■"ii7 

225 


""48 

56 


561 


AF003136 


Caenorhabditis
elegans


contains weak similarity to 
an AMP-binding motif 


373 
2926 


88 

54 


562 


AL109839 


Homo sapiens 


dJ1069P2.3.1 (novel PABPC1
(poly(A)-binding protein)


877 


100 


563 


AF181640 


Drosophila 
melanogaster 


BcDNA.GH09817


289 


42 


564 

565 
566 
567 
569 


AF052723 

AF161472 
Y28817
U09848
AF155113


Feline 

leukemia 

virus 

Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


gag-pol precursor polyprotein
gPr80 

pt326_4 secreted protein. 
Aj-iiu tinger pro c em 
NY-REN-55 antigen


1547 

439 
3338 
1738
3603 


43 

44 
100 

100

93 


570 
571 
572 
573 
574 


AF155113 
AL032821
M69181
M69181
Y59678 


Homo sapiens

Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


NY-REN-55 antigen
dJ55C23.1 (vanin 1) 
non-muscle myosin B 
non-muscle myosin B
Secreted protein 108-008-5-0- 

BO - V U » 


3951 
1821 
7350 
7311 
772 


99 
98 
99 
98 
100 


575 


AL365234


Arabidopsis 
thaliana 


putative protein


788 


40 


576 


AL365234


Arabidopsis 
thaliana 


putative protein 


788 


40 


577 
578 


X06745
AB041642


Homo sapiens 
Homo sapiens


DNA polymerase alpha-subunit
(AA 1-1462)
PAR-6


7619 


99 


579 


D86984
AF165124


Homo sapiens 


similar to yeast adenylate 
cyclase (S56776) 


1342 
2446


100 
100 


580 




Homo sapiens 


gamma-aminobutyric acid A
receptor gamma 2 


2499 


99 


581 
582 


W88812
U82319


Homo sapiens 
Homo sapiens 


Polypeptide fragment encoded 
by gene 58. 
novel ORF 


2339 
*42 


99 


583
584 


P5i2i9 r 
AJ223948


Homo sapiens 
(human) 
Homo sapiens 


CR1 protein. 
RNA helicase 


11425 


100 
99 


585 


Y08612


Homo sapiens 


88kDa nuclear pore complex
protein 


6608 

mil 


99 
99 


586 
587 


mm\ 


U.J « h \ 4 ^ j f .-J | 


Amino acid sequence of

Iv3l0 7. 

BAT4 


1007 
1873 


37 
98
TABLE 2



PCT/US00/34263 



SEQ 
ID 
NO: 
588
589


NUMBER 
AF13177S 


" Homo sapiens 


DESCRIPTION 
Unknown


SMITH-
WATERMAN 
SCORE 
1929 


%

IDENTITY 
99 


591 

592 
593 


" AJ25 , b865 '" 
Z98B85 

L?£S71 
AF091622 


" Homo sapiens 
Homo sapiens

Homo sapiens
Homo sapiens


TESS "2 " 

dJ522J7.2 (bromodomain- 

containing 1 (similar to 
peregrin, BR140)) 
nuclear hormone receptor 
PHD finger protein 3 


2348 
4167 

1355 
9054 


100 
100 

100 
100 


594 
595 
596 

597


X56807 

AL137802 

AL022329 

AF226048


Homo sapiens 
Homo sapiens 
Homo 
sapiens 
Homo sapiens 


desmocollin type 2a 
dJ798A10.1 (novel protein) 
bK407F11.2 (adrenergic, beta, 
receptor kinase 2) 
GL003 


4443 

212 

3653 


"100 
55 

100


598 


AJ278112 


" Homo 
sapiens] 

>Y49635 
Y4 9635 21- 
OCT- 19 9 9 15- 
APR-1998 
Human sdp3.5 
protein. 
[Homo 
sapiens 


putative cell cycle control 
protein 


2009 
335 


99 
23 


599 
600 


Y59741 
L36531 


Homo sapiens 
Homo sapiens 


Human normal ovarian tissue 
derived protein 10. 
integrin alpha 8 subunit 


1574 


99 


601 
602 


I<30%30 

AF218584 


Homo sapiens 
Homo sapiens 


Human secreted protein 

encoded by gene No. 20. 

GGA1 ~ 


5386 
895 


99 
100 


603
604 


AL132776 


Homo sapiens 
Homo sapiens 


serine /threonine protein 
kinase 

dJ393D12.1 (KIAA0776) 


3265 
5071 

"2413 


100 
99 


605 

606


Y14494 


Homo sapiens 
Homo sapiens 


OJ6B2J15.1 (novel Collagen 
triple helix repeat 
containing protein) 
aralar1


1979 
3465 


99 
100 

99 


607 
608 


AJ001981 
X86098 


Homo sapiens 

Homo 

sapiens 


OXAlIg 

binds directly to adenovirus 
type 5 ElA protein 


2603 
"30 £"9 


100 


610 
611


AFZ63572 
AF161503 


Homo sapiens 
Homo sapiens


Forssman glycolipid

synthetase

HSPC154 


1865 
1261 


99 
97 


612 

"613 

614 
615 
616 


L41834 
Y91954 

AL022327 

X85786 
Y08319 


Bnsis minor 

Homo sapiens 
Homo sapiens 


nuclear protein 

Human cytoskeleton associated 

protein 9 (CYSKP-9) . 

OJ355C1B.1 (KIAA0027) 

binding regulatory factor 

kinesin-2 


34* 
3££8 

361 
3203 


"30 
100 

94 
100 


617 
618 
619 


D12644 
U2B7d9 
Y35914 


Mus musculus

Mus musculus
Homo sapiens


KiF2 protein 

PACT - - 
Extended human secreted 
protein sequence, SEQ ID NO. 
163. 


3487 
3609 
5936 
1684 


99 
97 
89 
99 


620 
621


A30463B2 


Mus musculus 


testis-abundant finger
protein 


199 


23 


622 


Y00062 
AF068286 


Homo sapiens 


precursor polypeptide (AA -23 

to 1120) 

HDCMD38P 


3440 
861 


99 

100


623 
624 

625


X98248 
X61100 


Homo sapiens 
Homo sapiens


sortilin 

75 kDa subunit NADH
dehydrogenase precursor 


4436 
3734 


99 
99 


626 


S58544 
AF151027 


Homo sapiens 
Homo sapiens 


75 kDa infertility-related

sperm protein 

HSPC193 


2125 


99 


627 

£S<V 


X149<SB 


Homo sapiens 


ail-alpha subunit (aa 1-404) 


5B2 
2079 


93 
100 




Y50911 


Homo sapiens 


Human fetal brain cdna clone 
vb7_l derived protein 


1983 


100 
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TABLE 2 
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SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 

, , 


SMITH-

WATERMAN 
SCORE 


IDENTITY 


629 


Y50911 


Homo sapiens


Human fetal brain cDNA clone 
vb7_l derived protein 


"1694 


100 


630 


AF098786 


Homo 
sapiens 


17 beta-hydroxysteroid 
dehydrogenase type VII 


17S4 


100 


631 


AL034555 


Homo
sapiens 


dJl34019.3 (zinc finger 
protein 151 (pHZ-67)) 


4273 


100 


632 


W74826 


Homo sapiens 


Human secreted protein 
encoded by gene 98 clone 
HAQBT94 . 


794 


96


633 


AF288288 


Homo sapiens 


HPT protein 


" 223* 


100


634 


AF041429 


Homo sapiens 


pRGRl 


823 


99 


635 
636 


X66357 
Y11284 


Homo sapiens 
Homo sapiens 


serine/threonine protein

kinase 

AFX1 


1589 


100 


637 
638 


AR004884 
AJ002303


Homo sapiens 
Homo sapiens 


PKU-alpha
synaptogyrin lc 


2571 
"3718 


98 
99 


639 
640 


AJ002303 


Homo sapiens 
Homo sapiens 


synaptogyrin lb 
synaptogyrin lc 


1020 
1002 


100 
100 


641 

642 
643 


D87682 

M14 ccn 
X06661 


Homo sapiens 

Homo sapiens 
Homo sapiens 


similar to a C. elegans
protein encoded in cosmid 
T26A5. 
ISG-K54 

calbindin (AA 1-261) 


933 

2&7Z 

2473 


94 

100

99 


644 

645


AB031048 


Homo sapiens 

Drosophila 

melanogaster 


PR02822 

microtubule associated- 
protein orbit 


1358 
"IBS 
738 


100 
"?6 

27 


646 
647 


AF250842 
X86691 


Drosophila 
melanogaster 
Homo sapiens 


multiple asters 
Mi- 2 protein 


834 


29 


648 
"*49 


067934 


Homo sapiens 


44.9 kDa protein C18B11
homolog


10110 
827 


99 
96 




AF236061 


Oryctolagus 
cuniculus 


RING-finger binding protein 


3830 


91 


650 


AL034553 


Homo sapiens 


UJ914P20.2 (KIAA0784 protein 

similar to Mus musculus 
activity-dependent
neuroprotective protein
(Adnp))


5708 


100 


653 


X14766 


Homo sapiens 


GABA-A receptor alpha 1 
subunit 


2388


99 


654 


AC004614


Homo sapiens


similar to f-spondin proteins 
AB006086 (PID:g2529225) 


3026 


99 


656 


Z34975 


Homo sapiens 
Homo sapiens 


Human transmembrane protein 

HTMPN-32. 

IdlCp 


608 
3733 


99 
100 


658
659

660 
661 


AL050306
W76734 

AF202724 
Z21966 


Homo sapiens 
Homo 
sapiens 
Homo sapiens 


dJ475B7.2 (novel protein) 
Human mDia Rho targeting 
protein . 

Sadl unc-84 domain protein 1 
mPOU homeobox protein


1942 

781

2172 
1529 


99 
34 

100

100 


662 
663 

667 


AJ242954 
AF182316
AL161516 

X59303 


Mus musculus
Homo sapiens
Arabidopsis 
thaliana 
Homo sapiens 


dysferlin 
myoferlin

hypothetical protein 
valyl-tRNA synthetase 


47*2 
6232 
209 


59 
99 
30 


668 


Y133SS 


Homo sapiens 


Amino acid sequence of 
protein PRO220 . 


3393 
3C92 


99 
100 


669 


AB010692 


Arabidopsis 
thaliana 


contains similarity to endo- 
beta-N-acetylglucosaminidase 


611 


S2 


671 


X56123 


Mus musculus


gene 
talin 






672 
673 


AB039371 
AF269223 


Homo sapiens 
Homo sapiens 


mitochondrial ABC transporter 
3 

TCPll 


4474 
2902 


76 
99 


674 

675


AF229633 
L14463 


Mus musculus
Rattus 


groucho-related protein 4 
transducin


806 
4053 

3619


42 
99 
92 
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SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN 
SCORE 


%

IDENTITY 






norvegicus 








676


AC005757


Homo sapiens 


R32611 1 


2779 


100 


677 


S61069 


Homo sapiens 


reverse transcriptase 
homolog-pol (retroviral
element) 


252 


65 


678 


AF271388 


Homo sapiens 


CMP-N-acetylneuraminic acid
synthase 


2273 


100 


679 


X79066 


Homo sapiens 


ERF-1


1783 


100 


680 


AF118566 


Mus musculus


hematopoietic zinc finger 
protein 


769 


50 


681 


Y51415 


Homo 
sapiens 


Human wild type pKe83 
protein.


2<?21 


99 


682 


AL133545 


Homo sapiens 


bA386N14.1 (novel protein 
similar to a dual specificity 
phosphatase) 


700 


68 


683 


Y86214 


Homo sapiens 


Nuclear transport protein 
clone hfb34l protein 
sequence. 


5888 


99 


684 


Y94952 


Homo sapiens 


Human secreted protein clone 
fhll6_ll protein sequence 
SEQ ID NO: 110. 


354 


98 


685 


AL021878 


Homo sapiens 


dJ257120.4 (transcription
factor 20 (AR1) (KIAA0292)
(isoform 2))


154 


67 


686


AE000198 


Escherichia 
coli 


orf, hypothetical protein 


628 


100 


687 


M58378 


Homo sapiens 


synapsin I 


3730 


99 


688 


AF039697 


Homo sapiens 


antigen NY-CO-31


508 


98 


689 


009355 


Oryctolagus 
cuniculus 


protein phosphatase 2A1 B 
gamma subunit 


2356 


99 


690 


AF155106 


Homo sapiens 


NY-REN-36 antigen


265 


50 


691


AC004774 


Homo sapiens 


Dlx-5 


1542


100 


692 


X90530 


Homo sapiens 


ragB 


192<S 


99 


693 


X90530 


Homo sapiens 


ragB 


1405 


99 


694 


X90530 


Homo sapiens 


ragB 


1590 


85 


695 


G01563 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5644. 


330 


100 


696 


AC011810 


Arabidopsis
thaliana 


Putative methionine
aminopeptidase


669 


5"2 


697 


AJ250425 


Rattus 
norvegicus 


Collybistin I 


2455 


98 


698 


AB037901 


Homo 
sapiens 


gene amplified in squamous 
cell carcinoma-1


5364 


99 


699 


Y99401 


Homo sapiens 


Human PRO1327 (UNQ687) amino
acid sequence SEQ ID NO: 218.


1386 


100


701 


AF221712 


Homo 
sapiens 


Smad- and Olf-interacting
zinc finger protein 


6705 


100 


702 


X83573 


Homo sapiens 


ARSE 


3184 


99 


703 


AJ243274 


Homo sapiens 


AP-2rep protein 


2078 


99 


704 


Y71262 


Homo sapiens 


Human chondromodulin-like
protein, Zchml.


1697 


94 


705 


Y71262 


Homo sapiens 


Human chondromodulin-like 
protein, Zchml. 


1736 


99 


706 


Y41257


Homo sapiens 


Amino acid sequence of long 
human FAIM. 


1060 


100 


707 


AL022237


Homo sapiens 


bK119lB2.3 (PUTATIVE novel 
Acyl Transferase similar to 
C. elegans C50D2.7) (isoform 
1) ) 


2030 


100 


708


AJ006266 


Homo sapiens 


AND-1 protein 


5942 


100 


709 


G01571 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5652. 


777 


99 


710 


Y08698 


Homo sapiens 


raabp3 


2849 


98 


711 


Y68770 


Homo sapiens 


Amino acid sequence of a
human phosphorylation 
effector PHSP-2.


754 


99 
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SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN 
SCORE 


%

IDENTITY 


712 


U93574 


Homo sapiens 


putative pl50 


799 


" SS9 


713 


AC004531 


Homo sapiens 


Gene with similarity to DEAD
box helicases 


2715 


99 


714 


D89016 


Homo sapiens 


Neuroblastoma 


538 


""48 


715 


Y92175 


Homo sapiens 


Human cardiovascular system 
associated protein tyrosine 
phosphatase 2. 


734 


98 


716 


AL137013 


Homo sapiens 


bA311P8.3 (probable uracil 
phosphoribosyltransferase)


862 


100 


717 


AB035123 


Mus musculus


GD1 alpha/GT1a alpha/GQ1b
alpha synthase 


1696 


93 


718 


Y96290 


Homo *P40254 
P402S4 25- 
OCT-1984 09- 
APR-1983 
Human IgD. 
[Homo 
sapiens 


Human IGFAM-2 immunoglobulin. 


2345 


B5 


719 




Homo sapiens 


integrin beta 1 subunit
precursor 


4347 


99 


720 


AJ224B19 


Homo sapiens 


tumor suppressor 


2149 


99 


721 


Y07595 


Homo sapiens 


transcription factor TFIIH 


2373 


100 


722 


W41565 


Homo 

sapiens) 

>W41S64 

W41S64 08- 

OCT-1997 05- 

APR-1996 

Human 

calpain. 

[Homo 

sapiens 


Human calpain. 


1591 


9d 


723 


AF161341 


Homo sapiens 


HSPC078 


1097 


98 


724 


AF1B7318 


Homo sapiens 


F-box protein Fbx2 


1607 


100 


725 


AC006708 


Caenorhabditis
elegans


contains similarity to
Saccharomyces cerevisiae pre-
mRNA splicing protein PRP31
(GB:Z72876)


1143 


"46 


726 


AC006708 


Caenorhabditis
elegans


contains similarity to
Saccharomyces cerevisiae pre- 
mRNA splicing protein PRP31 
(GB:Z72876) 


988 


46 


727 


AC024818 


Caenorhabditis
elegans


contains similarity to Pfam 
family PF00400 (WD domain, 
G-beta repeat), score=81.8,
E=1.4e-20, N=3


950 


44


728 


AJ005897


Homo sapiens 


JMS 


831 


47 


729 


Y45377 


Homo sapiens 


Human secreted protein 
fragment encoded from gene 
27. 


908 


"97 


730


G03931 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8012. 


578 


100 


731 


AB012720 


Oncorhynchus
masou


GTP-binding protein


3865 


76 


732 


W73404 


Homo sapiens 


Human secreted protein 
encoded by Gene No. 8.


862 


97 


733


G02650 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6731. 


644 


§i 


734 


AC024813 


Caenorhabditis
elegans


Hypothetical protein 
Y54Fl0AL.a 


152 


24 


735 


AL035461 


Homo sapiens 


dJ967N21.6 (novel CDP-alcohol
phosphatidyltransferase
family member protein) 


1562 


98 


736


U00033 


Caenorhabditis
elegans


similar to S. cerevisiae YJU2 
protein 


605 


41 


737 


AF07909B 


Homo 
sapiens 


arginine-tRNA-protein
transferase 1-1p; ATE1-1p


2733 


99 
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TABLE 2 



SEQ 
ID 
NO: 


NUMBER 


SPECIES 
. 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


IDENTITY 


738 


AJ131712 


Homo sapiens 


, 

nucleolar RNA-helicase 


2793


100 


739 


AJ133115 


Homo sapiens 


TSC-22-like protein 


2054 


99 


740 




Homo sapiens


M-phase phosphoprotein 9


953 


100 


741 


X982*8 


Homo sapiens 


M-phase phosphoprotein 9 


564 


74 


742




Caenorhabditis
elegans


strong similarity to the YPT1 
sub-family of RAS proteins


960 


85 


743 


X76057 


Homo sapiens 


phosphomannose isomerase


2191 


100 


744 




Homo sapiens 


Human secreted protein, SEQ 

Xu NO: //JJO. 


496 


98 


745 


X97064 


— , 

Homo sapiens 


Sec23 protein 


4034 


99 


746 


W93946 


Homo sapiens 


Human regulatory molecule 
HRM-2 protein. 


994 


100 


747 


Y73388 


Homo sapiens 


HTRM clone 3376404 protein 
sequence . 


1565 


99 


748




Sus scrofa


follistatin A 


1906 


98 


749 


Au249457 


Trichomonas 
vaginalis 


centrin, putative 


183 


26 


750


AL0U44X0 


Homo sapiens 


fos3 9554 1 


2094 


100 


751 


AF074968 


Homo sapiens 


P47ING3 protein 


2167 


100 


752 


AF252284 


Homo sapiens 


transcription specificity 
factor Spl 


4005 


100 


753 

WFg 


AB049629 


Homo sapiens 


phospholysine 

phosphohistidine inorganic 
pyrophosphate phosphatase


1375 


99 


754 


D79205 


Homo sapiens 


ribosomal protein L39 


160 


77 


755 


7\ VIA A ft /I "> A 

AB0D84 JO 


Homo sapiens 


CDEP 


142 


29 


758 


L32162 


Homo sapiens 


transcription factor 


574 


80 


759 


AF037204 


Homo sapiens 


RING zinc finger protein 


295 


54 


760 


Y44250 


Homo 
sapiens 


Human cell signalling 
protein-13.


625 


100 


761 


AF218586 


Homo sapiens 


Cide-b 


1136 


100 


762 


U38934 


Gallus 
gallus 


histone H2A


625 


97 


763 


AF226053 


Homo sapiens 


HSKM-B 


606 


32 


764


X13403 


Homo sapiens 


Oct-1 protein (AA 1-743)


3*2* 


100 


765 


D87446 


Homo sapiens 


Similar to a C. elegans
protein encoded in cosmid 
C27F2 (U40419) 


568 


38 


766


AL023628 


Caenorhabditis
elegans


Y17G7B.14 


200 


27 


767 


Y82777 


Homo sapiens 


Human chordin related protein 
(Clone dw665_4) . 




99 


768 


X92475 


Homo sapiens 


ITBA1 


1429 


100 


769


Y42752 


Homo sapiens 
— 


Human calcium binding protein 
3 (CaBP-3) . 


1426 


100 


770 


X5L416 


Homo sapiens 


hormone receptor hERR1 (AA 1-
521) 


2641 


97 


771 


/«J uu t> o y x 


Homo sapiens 


cysteine-rich protein 


1793 


100 


772 




Homo sapiens


rap2 


935 


100 


773 


"Zl2l73 


Homo sapiens 


N-acetylglucosamine-6-
sulphatase


2970 


100 


774 


YQl Q^fl 


Homo sapiens 


Human cytoskeleton associated 
protein 5 (CYSKP-5) . 


565 


43 


776 




Homo sapiens 


dJ322P7.1 (zinc finger)


855 


56


777


AL023799 


Homo sapiens 


dJ322P7.1 (zinc finger) 


855 


56 


778


/"♦a i Qon 


Homo sapiens 


Human secreted protein, SEQ 


849 


98 


779 


AJ012590 


Homo sapiens 


glucose 1-dehydrogenase


4155 


99 


780 


AL078582 


Homo sapiens 


dJ130E4.2 (KIAA0796)


1321 


68 


781 


Z7*9*S 


Caenorhabditis
elegans


similar to mitochondrial 
carrier protein 


384 


34 


782 


AL109965 


Homo 
sapiens 


dJH21G12.2 (SCAN domain-
containing 1 protein)


900 


100 


783 


AF061262 


Mus 

musculus 


semaF cytoplasmic domain 
associated protein 2 


13 it 


83 


784 


G03873 


Homo sapiens 


Human secreted protein, SEQ


649 


95 
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SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 








ID NO: 7954. 






785 


Y84441 


Homo sapiens 


Amino acid sequence of a 
human RNA-associated
protein. 


2074 


100 


786 


Y00$18 


Homo sapiens 


Human Rab protein, RABP-1,
protein sequence. 


1048 


""99 ■ " 


787 


Z97029


Homo sapiens 


ribonuclease HI large subunit


1548 


99 


788 


AB035384 


Homo sapiens 


SRp25 nuclear protein 


962 


94 


789 


AF024631 


Homo sapiens 


ANG2


2644 


100 


790 


AJ006710 


Rattus 
norvegicus 


phosphatidylinositol 3-kinase


4508 


97 


792 


V00638 


bacteriophage
lambda


reading frame ea10


600 


100 


793 


AF049103 


Homo sapiens 


Huntingtin interacting 
protein 


819 


100 


795 


Z26317


Homo sapiens 


desmoglein 2 


4810 


99 


796 


Y76884 


Homo sapiens 


Retinoblastoma binding 
protein-7 sequence.


5080 


99 


797 


U15155 


Gallus 
gallus 


trypsinogen 


372 


37 


798 


U97189 


Caenorhabditis
elegans


strong similarity to the
P13/P14 family of kinases 


227 


28 


799 


AF112201 


Homo sapiens


neuronal protein NP25 


1053 


"100 


800 




Rattus 
norvegicus 


serine-arginine-rich splicing
regulatory protein SRRP86 


958 


63 


801


AF267852 


Homo sapiens 


placental protein 13-like
protein 


743 


99 


802


AF208851 


Homo sapiens 


BM-009 


766 


80 


803 


Z81097 


Caenorhabditis
elegans


Similarity to Human
retinoblastoma-binding 
protein RBAP46 yk662dl2.5 
comes from this gene 


152 


27 


804 


G02113 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6194. 


496 


98 


805 


AL121673 


Homo sapiens 


bA305P22.1 (novel protein) 


1160 


"icb 


806 


AC013483 


Arabidopsis 
thaliana 


putative GTPase activator 
protein 


264 


30 


807 


AC013483 


Arabidopsis 
thaliana 


putative GTPase activator
protein 


264 


30


808


AB013885 


Homo sapiens 


beta-ureidopropionase 


1494 


100 


809 


AF078842 


Homo sapiens 


HOTTL protein 


1581 


99 


810 


AF161421 


Homo sapiens 


HSPC303


2134 


96 


811 


AF261689 


Homo sapiens 


DNA polymerase epsilon p17
subunit 


734 


100 


812


Z74029


Caenorhabditis
elegans


Similarity to C. elegans
alcohol dehydrogenase comes 
from this gene 


6if> 


71 


813 


Z73497 


Homo sapiens 


CU240C2.2 (Core histone
H2A/H2B/H3/H4) 


324 


100 


814 


W87689 


Homo 
sapiens 


Human HTXFT19 polypeptide. 


1484 


99 


815 


X16282 


Homo 
sapiens 


zinc finger protein (217 AA) 
(1 is 2nd base in codon) 


1109 


99 


816


Z92539 


Mycobacterium
tuberculosis


pth 


300 


36 


818


aw n i h a a i 


Mus musculus 


B9 


197 


27 


819 


AL117555 


Homo sapiens 


hypothetical protein 


321 


94 


820 


AC005328 


Homo sapiens 


R26660J2, partial CDS 


865 


97 


821 


G03951 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8032.


700 


99 




L34807 


Musca 
domestica 


transposase 


174 


20 


823 


G02928 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7009. 


558 


78 


824


Z99531 


schizosaccha 


caffeine- induced death 


184


29 
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SEQ 
ID 

NO: 


NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


IDENTITY 


825 




romyces 
pombe 

Homo sapiens 


protein 1 

ultra high sulfur keratin






826 


U23037


Oryctolagus
cuniculus 


eIF-2Bepsilon 


693 
3406 


68 
90 


827 


G03412 
" Y30R77 


Homo sapiens


Human secreted protein, SEQ 
ID NO: 7493. 


464 


100 


828 




Homo sapiens 


Human secreted protein 
encoded from gene 17. 


113 


44 


829 


" Y32199 


Homo sapiens 


Human receptor molecule (REC) 
encoded by Incyte clone 
2022379 . 


1012 


100 


830 
832 


W78279 
AB011542 


Homo sapiens 
Homo sapiens 


Fragment of human secreted 
protein encoded by gene 33 . 
MEGF9 


1264


99 


833 


G02639 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6720. 


2097 
223 


100 
70 


834 


AF119664 


Homo sapiens 


transcriptional regulator 
protein HCNGP 


3,574 


100 


835 


AF119664 
' APUQ^/l 


Homo sapiens 


transcriptional regulator 
protein HCNGP 


1144 


89 


836 
837




Homo sapiens 
Homo sapiens 


transcriptional regulator 

protein HCNGP 

C protein (AA 1-159)


1448 


94 


838 

839
840 


U32865 

AF067730 
U27831 


Drosophila 
melanogaster 
Homo sapiens 
Homo sapiens 


linotte protein 
TLS-associated protein TASR-2 


918 
164 

£31 


100 
24 

56 


841 
842 


AF286366 
G02309 


Homo sapiens 
Homo sapiens 


striatum- enriched phosphatase 
CamKI-like protein kinase
Human secreted protein, SEQ 
ID NO: 6390. 


2840 
1796 
278 


98 

100 

98 


843 


AE003615 


Drosophila 
melanogaster 


ade3 gene product 


113 


48 


844
845


G01350 


Homo sapiens


Human secreted protein, SEQ 
ID NO: 5431. 


629 


100 


847 
848 


U27838
Y87788 

™ AO* ( JJ^f 


Mus musculus 

Homo sapiens 
Homo sapiens 


glycosyl-phosphatidyl-
inositol-anchored protein
homolog

Human RBP-26 protein.

uitt33 protein homolog 


336S 

2026 
2398 


100 
100 


849 
850 
851

852 


U41315 

AF192784 

Y58$28 

"222968 


Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 


ZNF127-Xp 
makorin 1 

Protein regulating gene 
expression PRGE-21.
M130 antigen 


2458 

2062 

1548 


93 
100 


853
854




Homo sapiens 


M130 antigen extracellular 
variant 


6205 
6*86 


100 

166 




G03362 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7443. 


330 


94 


855 
856 


G03362
AF285118 


Homo sapiens
Homo sapiens


Human secreted protein, SEQ 
ID NO: 7443. 

CGI-203


203


100 


857 


AC006069


Arabidopsis 
thaliana 


putative cleavage and 
polyadenylation specificity
factor 


452 
1383 


100 


858 


AL021546


Homo sapiens 


Cytochrome C Oxidase 
Polypeptide VIa-liver
precursor (EC 1.9.3.1) 


593 


100 


859 
860 


L05956 
AF201947 


Xenopus
laevis

Homo sapiens 


ribonucleoprotein 
MEK binding partner 1 


1664 


85 


861

862 


L31783 
AF161472 


Mus musculus
Homo sapiens 


uridine kinase
HSPC123 


616 

1266 


100 
$2 


863
864 


P"! 


Caenorhabditis
elegans


mitochondrial carrier protein 
tumor necrosis factor type 1


602 
370 

3559 


73
43 

99
SEQ
ID 
NO: 


ACCESSION 
NUMBER


SPECIES


DESCRIPTION
receptor associated protein 


SMITH-
WATERMAN
SCORE


% 

IDENTITY


865 


AE001530


Helicobacter 
pylori J99


putative 


230 


32 


866 


X57807


Homo sapiens 


immunoglobulin lambda light 
chain 


~^99


91


867 
868 


AL031673 
Y11652 


Homo sapiens 
Homo sapiens 


dJ694B14.1 (PUTATIVE novel 
KRAB box protein with 18 C2H2 
type Zinc finger domains) 
phosphate cyclase 


4066 


99 


869 

870 
871 


AF192968

AB020648
AL031427 


Homo sapiens 

Homo sapiens
Homo sapiens


high-glucose-regulated

protein 8 

KIAA0841 protein 

dJ167A19.1 (novel protein)


238 
3041 

3237 


100
99 

S9 1 


872 
873 

874


AL021331 


Homo sapiens
Homo sapiens

Homo sapiens


core histone macroH2A2.2 

dJ366N23.1 (putative C.

elegans UNC-93 (protein 1, 
C46F11.1) LIKE protein)


1608 
1666
1129 


100

100

100 


875 


AL117334 


Homo sapiens 


propionyl-CoA carboxylase 
dJ687F11.1 (novel protein
(part of translation of cDNA
DKFZp434N061, Em:AL110249))


3579 
306 


100 
100 


876 


X79489 


Saccharomyces
cerevisiae


E-925 protein 


446


35


877 
878 


Y53001 
AF2 aiOTft 


Homo sapiens 
Homo sapiens


Human secreted protein clone 
dn834_l protein sequence SEQ 
ID NO: 8. 
CHMP1.5


Oil 

957 


100
100 


879 
880 


vncjai 7 

A. f / 

AF001317 


Sus scrofa
Saccharomyces
cerevisiae


40S ribosomal protein S12
Soilp 


687 
478 


100
28 


881 
882 


Y87275 
M14036 


Homo sapiens 
Homo sapiens 


Human signal peptide 
containing protein HSPP-52 
SEQ ID NO: 52.
C1-inhibitor


2547 


100 


883 
884 


AB041261


Homo sapiens 
Mus musculus


calcium-independent
phospholipase A2
proline-rich protein 48


598 
2903 

999 


77
100

84


885 
886


Y10936
AF073997

Y57893


Homo sapiens 
Mus musculus 


hypothetical protein
myotubularin related protein
1


1104 
866 


99 

36


887 
888




Homo sapiens 
Homo sapiens 


Human transmembrane protein 
HTMPN-17. 

hypothetical protein 


1099


94


889 


AF210317 


Homo sapiens 


facilitative glucose 
transporter family member 
GLUT9


929 
2046 


99
99 


890 


Y36031 
Y36031


Homo sapiens 


Extended human secreted
protein sequence, SEQ ID NO. 
416. 


583 


100


891 






Extended human secreted 
protein sequence, SEQ ID NO. 
416. 


192 


57 


892 
893 


AF237631
AF09092$ J 


Homo sapiens
Homo sapiens


ubiquitous tropomodulin U-Tmod

PRO0477p


1798 


100 


894 
895


AL031228


Homo sapiens


dJ1033B10.2 (WD40 protein 

BING4 (similar to S. 
cerevisiae YER082C, M. sexta 
MNG10 and C. elegans F28D1.1) 


653 
3196 


99
100 


896


AL031228
AF171102


Homo sapiens 
Homo sapiens 


dJ1033B10.2 (WD40 protein 
BING4 (similar to S. 
cerevisiae YER082C, M. sexta 
MNG10 and C. elegans F28D1.1)
retinal degeneration B beta 


2825 
1365 


96 


897 


AE003551


Drosophila
melanogaster


CG18176 gene product


633 


33
SEQ
ID
NO:


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


IDENTITY 


898




Homo sapiens 


DEAD Box Protein 5 


2443 


100 


899


Z97184 


Homo sapiens 


EKE 2 


624 


100 




Z97184 


Homo sapiens 


EKE 2 


409 


98 


901 


AJ245587 


Homo sapiens 


Kruppel-type zinc finger 


1942 


100 


902 


AF091034 


Homo sapiens 


GTP-binding protein RAB22A


1011 


100 


903 


R95953 


Homo sapiens 


Eukaryotic cell growth 
inhibiting factor. 


414 


96 


904 


L04733 


Homo sapiens 


kinesin light chain 


1936 


72 


905 


AE003540 


Drosophila 
melanogaster 


CG10984 gene product


446 


33 


906 


M55542 


Homo sapiens 


guanylate binding protein 
isoform I 


2993 


98 


907 


M55542 


Homo sapiens 


guanylate binding protein 
isoform I 


2901 


96 


908 


W84085 


Homo sapiens 


Human membrane fusion protein 
WDProl . 


1889 


100 


909 


AF168676 


Homo 
sapiens 


TNF intracellular domain-interacting protein


647 


100 


910 


AB029150 


Homo sapiens 


KRAB zinc finger protein
HFB101L 


2196 


100 


911 


G02871 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6952. 


521 


100 


912 


G03162 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7243. 


387 


87 


913 


A0T243721 


Homo sapiens]
>Y92508 Y92508 13-APR-2000 06-OCT-1998
Human OXRE-5. [Homo sapiens


dTDP-4-keto-6-deoxy-D-glucose
4-reductase


1716


100 


914 


U24189 


Caenorhabditis
elegans


hypothetical protein 1207-1/ 
Method: conceptual 
translation supplied by
authors 


244 



41 


915 


Y02591 


Homo sapiens 


A human progesterone receptor 
complex p23-like protein. 


843 


99 


916 


AE000984 


Archaeoglobus
fulgidus


dinitrogenase reductase 
activating glycohydrolase
(draG)


171 


26 


913 


M23159 


Cricetus 
cricetus 


DHFR-coamplified protein 


163 


30 


919 


L12018 


Caenorhabditis
elegans


putative 


1H2 


41 


920 


AF102177 


Homo sapiens 


tumor antigen SLP-8p 


1260 


97 


921 


AL096712 


Homo sapiens 


dJ744I24.2 (similar to a 
novel human gene mapping to 
Activator) 


1017 


78 


922 


AL161495 


Arabidopsis
thaliana 


putative WD-repeat protein


86£ 


42 


923 


AL161495 


Arabidopsis 
thaliana 


putative WD-repeat protein 


442 


36 


924 


U97001 


Caenorhabditis
elegans


similar to 

Schizosaccharomyces pombe 


605 


51 


925 




Mus musculus


Fir 


1503 


95 


926


K92288 


Drosophila 
melanogaster 


beta-spectrin 


290 


51 


927 


Y27575 


Homo sapiens 


Human secreted protein 
encoded by gene No. 9. 


1392 


100 


928 


Y22499


Homo sapiens 


Human secreted protein 
sequence clone mh703_i. 


2249 


100 


930 


AJ224326 


Homo sapiens 


ribulose-5-phosphate-epimerase


912 


100 


931 


028991 


Caenorhabdit 


coded for by C. elegans cDNA


666 


55
SEQ
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 
is elegans


DESCRIPTION 

cm2ic7 


SMITH- 
WATERMAN 
SCORE 


IDENTITY 


932 
933 


AL080065 
G01884 




hypothetical protein 

Human secreted protein, SEQ 

ID NO: 5965.


210 
767 


25 
98 


934 


AJ276485 




protein 


1200 


100 


935 
936 


AZj035681 
AB026808 


Mus musculus


au/t>bU23.3 tnovel protein 
similar to drosophila 
transcriptional repressor) 
synaptotagmin XI 


1142 


80 


937


AB015345 


Homo sapiens 


HRIHFB2216 


2142
2601 


" _ 9S 
99


938 


X65724 


Homo sapiens 


ORF2


498 


100


939 


W89024 


Homo sapiens 


Polypeptide fragment encoded 
by gene 156. 


1487 


100


940 


G04047 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8128.


117


100


941 


AF094583 


Homo sapiens 


putative HIV-1 infection
related protein 


452 


100 


942 
943 


AC024200 
AF129756 


Caenorhabditis
elegans

Homo sapiens 


contains similarity to 
several zinc finger proteins 
but not to the zinc finger 
domains 

G5C 


350 


69


944 


K23765 


norvegicus 


alpha-tropomyosin


273 
133 


100 
96 


945 
946 

947


AF055473


Arabidopsis 
thaliana
Homo sapiens 
Homo sapiens 


Contains similarity to 

AD021 protein 
GAGE-8


583 

551 
273 


47 

44 
51 


948 
949 

950


X7S75^ 

AF143956 

Y36729 


Homo sapiens 
Mus musculus
Homo 
sapiens 


protein kinase C mu
coronin-2 

Human PG1 protein sequence. 


2019 
2300 
1861 


68 
93 
93 


951 


W49041 


Homo sapiens 


Human low density lipoprotein 
binding protein LBP-2.


282 


67 


952 


AB016881


Arabidopsis 
thaliana 


gene_id:MXC17.7~ 


203 


4* 


953 


vni Toe 


Homo sapiens 


Human ubiquitin-conjugating
enzyme >Y25341 Y25341 01-JUL-1999
12-AUG-1998 Human NCE-2
protein.


3*S 


100 


954 
955 


AF145615 
U09410 


Drosophila 
melanogaster 
Homo sapiens 


BCDNA.GH03377 

zinc finger protein ZNF131 


823 
2483 


46 
99 


956 
957 


U09410 
AF195623 


Homo sapiens 
Homo sapiens 


zinc finger protein ZNF131
cholinephosphotransferase 1
alpha 


1853 
2126 


99 
99 


958 


X94917 


Drosophila 


head-elevated expression in 


155 


32 


959 
960 


U54807 
AF058807


Rattus 
Bos taurus 


GTP-binding protein

GTP-binding protein rah


1167 


97 


961 
962 


G03244 

• V 9 DO JU 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7325. 


606 
471 


97 
100 


963 


AP001754


Homo sapiens 


steroid dehydrogenase homolog 
transient receptor potential- 
related channel 7, a novel
putative Ca2+ channel protein 


583 
317 


40 
30 


964 


AL035419 


Homo sapiens


dJ1100H13.1 (putative novel
protein) 


1129 


100


965 


X61361 


Rattus 
rattus 


interferon-induced protein


202 


46 


966 
967 


D38169


Homo 
sapiens 


inositol 1,4,5-trisphosphate 
3-kinase isoenzyme


3278 


100 




AL031432


Homo 
sapiens 


dJ465N24.2.1 (PUTATIVE novel
protein) (isoform 1) 


893 


100
TABLE 2



SEQ 
ID 
NO: 


ACCESSION 
NUMBER


SPECIES 



DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


IDENTITY 


968 


U79275 


Homo sapiens 


unknown 


611 


100 


969 


" AJ0L1306 


Homo 

sapiens


guanine nucleotide exchange 
factor (long isoform) 


2752 


99 


970


AF281134


Homo sapiens 


exosome component Rrp46 


1186 


100 


971 


U53336


Caenorhabditis
elegans


weak similarity over a short 
region to myosin heavy chain


536


23 


972 


ACfl 1 R7d <1 

A\» U — O / It J* 


Leishmania
major 


IiB840. 12 


589 


53 


973 


X O O J ut 


Mus musculus


JjNV 


544 


85 


974 




Homo sapiens 


Tax1 binding protein


852 


98 


975 


AF049523 


Homo sapiens 


huntingtin-interacting
protein HYPA/FBP11 


1390 


97 


976 


«r 1D1DJU 


Homo sapiens 


HSPC182 


1040 


100 


977


G04020


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8101. 


626


100 


978 


AF164797 


Homo sapiens 


ribosomal protein L17 isolog


908 


100


979 


U94991 


Xenopus 
laevis 


transcription factor XLM01 


795 


97 


980 


S73775 


Homo sapiens 


calmitine; calsequestrine


2029 


100 


981 


Y94888 


Homo 
sapiens 


Human protein clone HP01462. 


2501 


100 


982 


AJ243191 


Homo sapiens 


heat shock protein 


827 


96 


983


X65020


Bos taurus 


PSST subunit of the NADH: 
ubiquinone oxidoreductase 
complex 


964 


85 


984 


AJ249207 


Rhodococcus 
sp. AD45 


putative racemase


351 


43 


985 


Z30093 


Homo sapiens 


basic transcription factor 2, 
35 kD subunit 


1576 


99 


986 


AB030835 


Homo sapiens 


contains two glutamine rich 
domains, three zinc-finger
domains, and matrin 3 
homologous domain 3 (MH3) 


4697 


99 


987 


AF227258 


Bos taurus 


RPGR-interacting protein-1


1262 


38 


988 


AL022238 


Homo sapiens 


dJ1042K10.2 (supported by
GENSCAN, FGENES and GENEWISE)


4048 


99 




AL022238


Homo sapiens 


dJ1042K10.2 (supported by 
GENSCAN, FGENES and GENEWISE)


2321 


99 


990


At 16 1425 


Homo sapiens 


HSPC308 


448 


92 


991
992


AF161426 


Homo sapiens 


HSPC308 


446 


92 




AF161426


Homo sapiens 


HSPC308 


453 


92 


993 


AL023859


Schizosaccharomyces
pombe


tRNA-splicing endonuclease
subunit 


172 


42 






Homo sapiens 


dJ513M9.1 (novel Homeobox
domain protein) 


241 


47 


995 


AC005253


Homo sapiens 


R26445_1


902 


100 


996 


AF265206 


Homo sapiens 


MOG1 isoform A


974 


100 


997 


AJ24828S 


Pyrococcus
abyssi


sarcosine oxidase, subunit
beta (soxB) 


195 


28 


998 


AE003641 


Drosophila 
melanogaster 


BG:DS00941.3 gene product 


218 


58 


999 


W69343 


Homo 
sapiens 


Secreted protein ot 1 clone 
CR930 1. 


1340 


98 


1000


AIO07135 


Homo sapiens 


similar to bovine adp/atp 
translocase T1 mRNA with
GenBank Accession Number
M24102.1


1543 


100 


1001 


Y73381 


Homo sapiens 


HTRM clone 1877278 protein 
sequence. 


1668 


100 


1002 


AF208844 


Homo sapiens 


BM-002 


428 


100


1003


AB004944 


Pseudomonas 
aeruginosa 


hypothetical protein 


134


35 


1004 


AL031431 


Homo sapiens


dJ462023.2 (novel protein)


2058 


100 


1005 


S45367 


Canis
familiaris 


centractin 


1949 


100
SEQ
ID
NO:


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 






Canis
familiaris


centractin 


1315 


98 


1007 




Mus
musculus


chaperonin containing TCP-1
epsilon subunit 


2649 


96 


1008 


Y76332 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 38. 


1282 


97 


1009 


AB011414 


Homo sapiens 


Kruppel-type zinc finger 
protein 


1671 


58 


1010 


Z68218 


Caenorhabditis
elegans


K01H12.1 


269 


67 


1011 


AB011414 


Homo sapiens 


Kruppel-type zinc finger
protein 


1671 


58 


1012 


Z14000 


Homo sapiens 


RING1 


2017 


100 


1013 


G02841


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6922. 


332 


93 


1014 


AF145659 


Drosophila 
melanogaster 


BcDNA.GH10333


1244 


52 


1015 


Y02860 


Homo sapiens


Fragment of human secreted 
protein encoded by gene 65. 


664 


67 


1016 


Y02591 


Homo sapiens 


A human progesterone receptor 
complex p23-like protein. 


772


97


1017


Y99448 


Homo sapiens 


Human PRO1759 (UNQ832) amino
acid sequence SEQ ID NO: 374.


2323


100 




X67250 


Rattus 
norvegicus 


n-chimaerin 


1710 


97 


1019 


AF183417 


Homo 
sapiens 


microtubule-associated 
proteins 1A/1B light chain 3 


631 


100 


1020 


AF164795 


Homo sapiens 


sex-regulated protein janus-a


674 


100 


1021 


AF190625


Coturnix 
coturnix 


qclgll-l 


638 


96 


1022 


AL133363 


Arabidopsis
thaliana 


putative protein 


155 


$1 


1023 


AB034912 


Homo sapiens 


WD-repeat like sequence


2483 


100 


1024 


AY007091 


Homo sapiens


similar to Homo sapiens 
mammalian inositol 
hexakisphosphate kinase 2 
(IP6K2) mRNA with Ge 


2243 


100 


1025


X69910 


Homo sapiens 


P63 protein 


2958 


9$ 


1026


U80736


Homo sapiens 


CAGF9 


1657 


100 


1027


AB029333


Halocynthia 
roretzi 


HrPET-1 


1048 


54 


1028 


AB032931 


Homo sapiens 


ubiquitin-conjugating enzyme
isolog 


104S 


100 




G01797


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5878. 


749 


98 




G01797


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5878. 


749 


98 


1031 


AF193795 


Homo sapiens 


vacuolar sorting protein 


<teo 


100 


1032


AJ222968 


Mus musculus


L-periaxin 


120 


30 


1033 


Z81317 


Schizosaccharomyces
pombe


DNA2-NAM7 helicase family 
protein 


685 


31 


1034 




Homo sapiens 


Fragment of human secreted 
protein encoded by gene 75. 


1321 


99 


1035


AJ276004


Mus musculus


Paxneb protein 


1709 


77 


1036 


AF025459 


Caenorhabditis
elegans


H14A12.3 gene product 


190 


30 


1037 


U37251 


Homo sapiens 


Description: KRAB zinc finger 
protein; this is a splicing 
supplied by author 


196 


43 


1038 


W74580


Homo 
sapiens 


Human membrane protein 
BA0306. 


1921 


97 


1039 


U88173 


Caenorhabditis
elegans


weak similarity to 
Arabidopsis thaliana 
ubiquitin-like protein 8


331 


80
SEQ
ID 
NO: 


ACCESSION 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


1040 




Homo sapiens 


blood group carrier molecule 
DOK1 


1637 


99 


1041


Y96730 


Homo 
sapiens 


PRO539, a Costal-2 homologue.


162 


22 


1042 




Mus musculus 


F-box protein FWD2 


2397 


98 


1043 


AF151023 


Homo sapiens 


HSPC189


1104 


100 


1044 


AF181631 


Drosophila 
melanogaster


BcDNA.GH04929


204 


37 




Y77985 


Homo sapiens 


Human collectin amino acid 
sequence.


1940


100 


1046 


AJ243972


Homo sapiens 


6-phosphogluconolactonase


1317 


100 


1047


AdUJsoo3 


Homo sapiens 


ATP specific succinyl CoA 
synthetase beta subunit 
precursor 


2324 


99 


1048 


AL034550 


Homo sapiens 


dJ1184F4.2 (novel protein
similar to nucleolar protein
4 (NOL4) (NOLP))


981 


92 


1049




Homo sapiens 


pre-B lymphocyte protein 3 


634 


100 


1050 


AF201949 


Homo sapiens 


60S ribosomal protein L30 
isolog 


B68 


100 


1051 


AF190624 


Mus musculus 


mdgl-l


236 


85 


1052


AE003529 


Drosophila 
melanogaster 


CG6151 gene product 


160 


44 


1053


G01191 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5272. 


646 


98


1054 


AL162756 


Neisseria 
meningitidis 


Glu-tRNA(Gln)
amidotransferase subunit A


682 


44 


1055 


AF181856


Rattus
norvegicus 


tRNA selenocysteine
associated protein 


1525 


99 


1056 


U89649 


Chlamydomonas
reinhardtii


Mr19,000 outer arm dynein
light chain 


244 


34 


1057 


AF159141 


Homo sapiens 


breast cancer metastasis- 
suppressor 1 


663 


53 


1058 


AF230929 


Homo 
sapiens 


keratinocyte annexin-like
protein pemphaxin 


1710 


99 


1059


AJ270952 


Homo sapiens 


putative membrane protein 


1363 


100 


1060


AF224263 


Heterodontus 
francisci


HOXDB 


742 


83 


1061 


X63417 


Homo sapiens 


IRLB 


1037 


100 


1062 


AL079345 


Streptomyces
coelicolor
A3(2)


hypothetical protein 



143 


27 


1063




Homo sapiens 


Human Hydrolase protein-10
(HYDRL-10).


2547 


100 


1064 




Homo sapiens 


acetyl-CoA synthetase 


3493 


99 


1065 


Y13356 


Homo sapiens 


Amino acid sequence of 
protein PR0221. 


1363 


100 


1066 


AC006153 


Homo sapiens 


similar to Aquifex aeolicus 
GTP-binding protein; similar 
to AE000771 (PID:g2984292)


£62 


98 


1067


Y18930 


Sulfolobus
solfataricus


hypothetical protein 


162 


29


1068 


R65969 


Homo 

sapiens T98G 


Glioblastoma-derived
polypeptide. 


887 


100 


1069 


XU / ?b» 


Homo sapiens 


Human secreted protein 
fragment 


863 


96 


1070 


AF177476 


Rattus 
norvegicus 


CDK5 activator-binding 
protein 


1995 


"8* 


1071 


AF245505 


Homo sapiens 


adlican


3109 


99 


1072 


U92794 


Mus musculus 


alpha glucosidase II, beta
subunit 


147 


36 


1073 


G03889 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7970. 


69B 


98 


1074 


U15779 


Homo sapiens 


p70 


380 


28 


1075 


Y13392 


Homo sapiens


Amino acid sequence of 


1271


91
SEQ
ID
NO:


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 








protein PRO328.






1076 


AF161457


Homo sapiens 


HSPC339 


571 


100


1077




Homo sapiens 


Human carbohydrate-associated
protein CRBAP-5.


2151 


98 


1078 




Homo sapiens 


HT015 protein 


831 


££ 


1079 


AL13296S 


Arabidopsis 


putative WD-40 repeat protein


286 


29 


1080 


/wv«i/4 # 


Homo sapiens


LiUnA 


1284 


100 


1081 


Y14768


Homo sapiens 


V-ATPase G-subunit like 
protein 


579 


100 


1082


AF016416 


Caenorhabditis
elegans


F29A7.4 gene product 


141 


31 


1083 


L13291


Homo sapiens 


ADP-ribosylarginine hydrolase 


802 


45 


1084 


AB041541 


Mus musculus


unnamed protein product 


151 


44


1085 


G01922 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6003.


262 


97 


1086 


AB030814 


Homo sapiens 


H-REV107 protein homolog 


833 


100 


1087 


AF151638 


Homo sapiens 


phosphatidylcholine transfer 
protein 


1142 


100 


1088 


Y84432 


Homo sapiens 


Amino acid sequence of a
human RNA-associated
protein.


2783


100 


1089 


Y94867


Homo 
sapiens 


Human protein clone HP10563. 


613 


100 


1090 


AK023982 


Homo sapiens 


unnamed protein product 


130 


49 


1091 


AB041586 


Mus musculus 


unnamed protein product 


1103 


81 


1092 


Y71277 


Homo sapiens 


Human Zlipo3 protein. 


606 


100 


1093 


U34973


Mus musculus 


protein tyrosine phosphatase-like


1131 


95 


1094 


Y66677 


Homo 
sapiens 


Membrane-bound protein
PRO828.


522 


' 


1095 


Y87276


Homo sapiens 


Human signal peptide 
containing protein HSPP-53 
SEQ ID NO: 53. 


1029 


99 


1096


Y87276


Homo sapiens 


Human signal peptide 
containing protein HSPP-53 
SEQ ID NO: 53. 


863 


98 




Af JLd145d 


Homo sapiens 


HSPC337 


742 


98 


1098 


U86029 


Caenorhabdit 
is elegans 


similar to thioredoxin 


242 


39 


1099


AJ005866


Homo sapiens 


Sqv-7-like protein


1321 


99 


1100 


AJ005866 


Homo sapiens 


Sqv-7-like protein


1118


99 


1101 


AJ005866


Homo sapiens 


Sqv-7-like protein 


891 


99 


1102 


AJ005866 


Homo sapiens 


Sqv-7-like protein 


1016 


99 


1103 


AL110244


Homo sapiens 


hypothetical protein 


299 


31 


1104 


AF242194 


Drosophila 
melanogaster


brakeless-B


147 


52 


1105


AL031010


Homo sapiens 


dJ422F24.1 (PUTATIVE novel 
protein similar to C. elegans 
C02C2.S) 


968 


100 


1106


IMbQlb 


Mus musculus 


parathion hydrolase 
(phosphotriesterase)-related
protein 


1624 


87 


1107 


s.inoi crt 
Auz /HJ.bU 


Homo sapiens 


putative lipid kinase 


2207 


99 


1108


G03733 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7814.


495 


98


1109 


AF217287 


melanogaster 


w piULcin tuiooio 


834 


54 


1110 


Y28921 


Homo 
sapiens 


Human regulatory protein 
HRGP-7. 


941 


48 


1111 


Y28921 


Homo 
sapiens 


Human regulatory protein 
HRGP-7. 


1331 


51 


1112 


AF176704 


Homo sapiens 


F-box protein FBX9 


2027 


99 


1113 


AF182076


Homo 
sapiens 


glioma tumor suppressor 
candidate region protein 2 


2418 


100 


1114 


G02039


Homo sapiens 


Human secreted protein, SEQ 


475 


96
SEQ
SPECIES


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


%

IDENTITY 








ID NO: 6120.






1115 


AF224439 


Mus musculus 


zinc finger protein 289 


1697 


91 


1116 


L40357 


Homo sapiens 


thyroid receptor interactor 


509 


100 


1117 


L40357 


Homo sapiens 


thyroid receptor interactor 


404 


85 


1118 


A12155 


Homo sapiens 


Human X5L cDNA. 


1673 


100 


1119 


AL161542


Arabidopsis 
thaliana 


isomerase like protein 


607 


£3 


1120 


AL023754 


Homo sapiens 


dJ272L16.1 (Rat
Ca2+/Calmodulin dependent
Protein Kinase LIKE protein)


2341 


98 


1121 


ib/901 


Homo sapiens 


Human transmembrane protein 
HTMPN-25.


321 


36 




Z14122


Xenopus
laevis 


XLCL2 


455 


77 


1123 


AF225418


Homo sapiens 


lipase 


1*31 


97 


1124 


Y06518 


Homo sapiens 


Zen GTPase interacting 
protein ZIP. 


3227 


100 


1125 


AL035690 


Homo sapiens 


dJ202I21.1 (novel protein)


952 


100 


1126 


AJ000217 


Homo sapiens 


CLIC2 


1286 


59 


1127 


AB030505 


Mus musculus 


UBE-lc2 


1069 


79 


1128 


Y73*75" 


Homo sapiens 


HTRM clone 1427838 protein 
sequence.


874 


100 


1129 


Y78941 


Homo sapiens 


Cyclophilin-type peptidyl
prolyl cis/trans isomerase 
amino acid sequence. 


877 


100 


1130 


AL023553 


Homo sapiens 


dJ347H13.4 (novel protein) 


557 


100 


1131 


Y91945 


Homo sapiens 


Human chaperone protein 6 
(HCHP-6).


1408 


100 


1132 


Z68197 


Schizosaccharomyces
pombe


putative nuclear pore protein 


596 


39 


1133 


Z68197


Schizosaccharomyces
pombe


putative nuclear pore protein 


389 


35 


1134 


AF180681 


Homo sapiens 


guanine nucleotide exchange 
factor 


3597 


100 


1135 


AF079765 


Mus musculus


enhancer of polycomb 


"2^4 


4i 


1136 


M62419 


Mus musculus 


clathrin-associated protein 


2189 


9S> 


1137 


AJ006219 


Drosophila 
melanogaster 


clathrin-associated protein 


1254 


78 


1138


Y76218


Homo sapiens 


Human secreted protein 
encoded by gene 95. 


440 


98 


1139 


W88104 


Homo 
sapiens 


A Rab protein designated 
HRABS-2. 


1065 


99 


1140 


Y13401 


Homo sapiens 


Amino acid sequence of 
protein PR0339. 


3979 


98 


1141 


W85026 


Chimeric - 
Homo sapiens 


Green fluorescent protein- 
Zap70 fusion product. 


3*09 


100 


1142 


Y13402 


Homo sapiens 


Amino acid sequence of 
protein PRO310.


1694 


99 


1143 


G03875 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7956. 


660 


99 


1144 


Y12917 


Homo sapiens 


Amino acid sequence of a 
human secreted peptide. 


750 


98 


1145 


Y12917 


Homo sapiens 


Amino acid sequence of a 
human secreted peptide. 


109£ 


100


1146 




Homo sapiens 


SPIN (SPINDLIN HOMOLOG
(PROTEIN DXF34))


1233 


100 


1147 


AL022157 


Homo sapiens 


SPIN (SPINDLIN HOMOLOG 
(PROTEIN DXF34))


1233 


100 


1148 


G02548 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6629. 


370 


93 


1149 


Y73338 


Homo sapiens 


HTRM clone 2019742 protein 
sequence.


1492 


100 


1150


W74841 


Homo sapiens 


Human secreted protein 
encoded by gene 113 clone 


228 


5S
SCORE


% 

IDENTITY 








HEAAR60. 






1151 


AF044201 


Rattus
norvegicus 


neural membrane protein 35; 
NMP35 


1570 


92 


1152 


AF156774


Homo 
sapiens


lysophosphatidic acid 
acyltransferase-gamma1


iaSS 


99 


1153 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein 
(translation of the cDNA 
DKFZp566A0946, Em:AL050069))


872 


64 


1154 


AF131852 


Homo sapiens 


Unknown 


473 


100 


1155 


Y41705 


Homo 
sapiens 


Human PR0352 protein 
sequence. 


1381 


97


1156 


G04036 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8117.


607 


99 


1157 


AF112444


Lupinus 
luteus 


L-asparaginase 


28* 


43 


1158 


AF151848 


Homo sapiens 


CGI-90 protein


232 


32 


1159 


AJ272267 


Homo sapiens 


choline dehydrogenase 


2449 


100 


1160 


AB001773 


Ciona 
savignyi 


PEM-6 


196 


33 


1161 


Y87330 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107 
SEQ ID NO: 107. 


746 


83 


1162 


Y87330 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107 

SEQ ID NO: 107. 


746 


83 


1163 


AF113534 


Homo sapiens 


^P1-BP74 protein 


2723 


96 


1164 


AF232226 


Danio rerio 


Deddl


191 


41 


1165 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein 
(translation of the cDNA 
DKFZp566A0946, Em:AL050069))


1051 


71 


1166 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


945 


76 


1167 


AF187733


Homo sapiens 


syntaphilin 


831 


42 


1168 


AB019435 


Homo sapiens 


phospholipase


951 


55 


1169 


AF064604 


Homo sapiens 


KE03 protein 


324 


33 


1170 


YoikM 


Homo sapiens 


Polypeptide fragment encoded 
by gene 6. 


1191 


100 


1171 


L03188 


Saccharomyces
cerevisiae


putative 


180 


22 


1172 


AF113751 


Mus musculus 


nuclear pore membrane 
glycoprotein POM210 


3941 


81 


1173 


AJ245417 


Homo sapiens 


G5b protein 


794 


100 


1174 


AL022238 


Homo sapiens 


dJ1042K10.3 (novel protein)


1285 


100 


1175 


U41278 


Caenorhabditis
elegans


F33G12.3 gene product 


332


28


1176 


M35617 


Homo sapiens 


T-cell receptor V-alpha-J- 
alpha region 


284 


83 


1177 


AC012680 


Arabidopsis 
thaliana 


putative protein phosphatase 
2C; 55455-56414 


209 


37 


1178 


G01345 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5426. 


692 


99 


1179 


AL09476-* 


Homo sapiens 


dJ579N16.3 (novel protein
similar to worm, Arabidopsis 
and pine proteins) 


1342 


100 


1180 


AF039716 


Caenorhabditis
elegans


similar to ATP synthase B 
chain 


496 


55 


1181 


Y11710 


Homo sapiens 


collagen type XIV 


1046 


97 


1182 


X82240 


Homo sapiens]
>R94974 R94974 09-MAY-1996 27-OCT-1994
Human TCL-1 polypeptide.


T cell leukemia/lymphoma 1 


617 


100
SEQ
SPECIES


DESCRIPTION 


SMITH-
WATERMAN 
SCORE 


IDENTITY 






[Homo 
sapiens 








1183 


U42841


Caenorhabditis
elegans


short region of weak
similarity to collagen 


161 


33 


1185 


AJ131613 


Homo sapiens 


dicarboxylate carrier protein 


1470 


99 


1186 


L27645 


Danio rerio 


growth-associated protein


130


36


1187 


Y02738 


Homo sapiens 


Human secreted protein 
encoded by gene 89 clone 
HLHFP03.


636 


100 


1188 


AF217544 


Xenopus 
laevis 


ornithine decarboxylase-2


1459 


60


1189 


AL136307 


Homo sapiens 


dJ380B8.2 (Neuritin, a
protein which promotes 
neurite outgrowth) 


182 


33 


1190 


X89602 


Homo sapiens 


rTSbeta 


197 


100 


1191 


U32828


Haemophilus 

influenzae 

Rd 


ribosomal protein S6 
modification protein (rimK) 


266 


31 


1192 


AF154831 


Rattus 
norvegicus 


PV-1


1403 


60 


1193 


Y50926 


Homo sapiens 


Human fetal brain cDNA clone 
vcl6_l derived protein. 


918 


100 


1194 


AF026530 


Rattus 
norvegicus 


stathmin-like-protein splice 
variant RB3 • 1 


1093 


97 


1195 


U35244 


Rattus 
norvegicus 


vacuolar protein sorting 
homolog r-vps33a


2981 


9* 


1196 


Y70470 


Homo sapiens 


Human p53 target molecule, 
PR63 protein. 


1680 


100 


1197 


AF157318 


Homo sapiens 


AD-017 protein 


912 


47 


1198 


AF125443 


Caenorhabditis elegans 


contains similarity to s. 
pombe phosphatidyl synthase 
(GB:Z28295) 


460 


39 


1199 


AF201934 


Homo sapiens 


DC12 


1649 


88 


1200 


AL031775 


Homo sapiens 


dJ30M3.3 (novel protein 
similar to C. elegans 
Y63D3A.4) 


1902 


100 


1201 


M21103 


Ovis aries 


BIIIB4 high-sulfur keratin 


484 


82 


1202 


Z85986 


Homo sapiens 


dJ108K11.3 (similar to yeast 
suppressor protein SRP40) 


1143 


75 


1203 


U18762 


Rattus 
norvegicus 


retinol dehydrogenase type I 


890 


52 


1204 


U35730 


Mus musculus 


jerky 


2235 


7<J 


1205 


AB002327 


Homo sapiens 


KiAA03i9 


151 


24 


1206 


AB019233 


Arabidopsis 
thaliana 


ubiquinone/menaquinone 
biosynthesis 
methyltransferase-like 


762 


56 


1207 


AL136307 


Homo sapiens 


dJ380B8.2 (Neuritin, a 
protein which promotes 
neurite outgrowth) 


742 


100 


1208 


AF207983 


Homo sapiens 


orphan G-protein coupled 
receptor 


2326 


100 


1209 


Z97630 


Homo sapiens 


dJ466N1.4 (novel protein 
similar to ANK3 (ankyrin 3, 
node of Ranvier (ankyrin 
G))) 


181 


44 


1210 


U21549 


Mus musculus 


Ac39/physophilin 


1280 


68 


1211 


Y27700 


Homo sapiens 


Human secreted protein 
encoded by gene No. 12. 


1267 


100 


1212 


AF117814 


Mus musculus 


odd-skipped related 1 protein 


945 




1213 


AF277233 


Naegleria 
fowleri 


calcineurin B 


222 


39 


1214 


D14649 


Mus musculus 


meiosis-specific nuclear 
structural protein 1 


1950 


77 


1215 


G03022 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7103. 


590 


100 


1216 


Z72510 


Caenorhabditis elegans 


similarity to yeast UTR3 
protein (Swiss Prot accession 
yk677h11.5 comes from this 
gene 


634 


49 


1217 


Z49703 


Saccharomyces cerevisiae 


unknown 


134 


22 


1218 


AC013430 


Arabidopsis 
thaliana 


F3F9.18 


199 


29 


1219 


L10910 


Homo sapiens 


splicing factor 


1026 


71 


1220 


Z70750 


Caenorhabditis elegans 


similar to vanadate 
resistance protein 
transmembranous comes from 
this gene 


965 


58 


1221 


AL163815 


Arabidopsis 
thaliana 


putative protein 


653 


61 


1222 


AF155100 


Homo sapiens 


zinc finger protein KY-REN-21 
antigen 


2261 


100 


1223 


J05071 


Bos taurus 


GTP-binding regulatory 
protein gamma-6 subunit 


356 


100 


1224 


Y73364 


Homo sapiens 


HTRM clone 27G^Q91 protein 
sequence. 


1169 


99 


1225 


AL050170 


Homo sapiens 


hypothetical protein 


714 


100 


1226 


X64002 


Homo sapiens 


RAP74 


2661 


99 


1227 


X04085 


Homo sapiens 


catalase 


2846 


100 


1228 


AJ005620 


Mus musculus 


skeletal muscle-specific gene 


1416 


90 


1229 


AF045564 


Rattus 
norvegicus 


development-related protein 


1715 


93 


1230 


X97571 


Mus musculus 


HCMV-interacting protein 


479 


96 


1231 
1232 


L08239 
AF121863 


Homo sapiens 
Homo sapiens 


located at OATL1 
sorting nexin 14 


2274 


100 


1233 


AF121863 


Homo sapiens 


sorting nexin 14 


1964 
1203 


100 
84 


1234 


AC024805 


Caenorhabditis elegans 


contains similarity to 
TR:O04595 


744 


31 


1235 


AC006634 


Caenorhabditis elegans 


contains similarity to 
Saccharomyces cerevisiae 
probable membrane protein 
YLR418C (GB:U20162) 


357 


33 


1236 


Y18101 


Mus musculus 


macrophage actin-associated- 
tyrosine-phosphorylated 
protein 


1559 


S7 


1237 


AB042646 


Homo sapiens 


TGIP2 


1224 


100 


1238 


AB026264 


Homo sapiens 


IMPACT 


1694 


100 


1239 


AB026264 


Homo sapiens 


IMPACT 


1123 


100 


1240 


G00429 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 4510. 


324 


100 


1241 


Y76144 


Homo sapiens 


Human secreted protein 
encoded by gene 21. 


1363 


53 


1242 


AL035602 


Arabidopsis 
thaliana 


putative protein 


499 


28 


1243 


X76483 


Gallus 
gallus 


Yes-associated protein 
(65kDa) 


574 


48 


1244 


AF220186 


Homo sapiens 


uncharacterized hypothalamus 
protein HT012 


503 


100 


1245 


AL021453 


Homo sapiens 


dJ821D11.3 (PUTATIVE protein) 


856 


100 


1246 


AJ276003 


Homo sapiens 


GAR1 protein 


1216 


100 


1247 


Y57910 


Homo sapiens 


Human transmembrane protein 
HTMPN-34. 


1369 


98 


1248 


ALUUSH /*» 


Homo sapiens 


similar to N- 
acetylgalactosaminyltransferase; 
similar to Q07537 
(PID:g1171989) 


9S7 


100 


1249 


AF199597 


Homo 
sapiens 


A-type potassium channel 
modulatory protein 1 


1139 


100 


1250 


Y13148 


Rattus 
norvegicus 


PAG608 


1350 


88 


1251 


M24852 


Rattus 
norvegicus 


neuron-specific protein PEP-19 


124 


46 


TABLE 2 



PCT/US00/34263 


1252 


AF146738 


Rattus 
norvegicus 




^71 


83 


1253 


G02725 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6806. 


419 


97 


1254 


W44375 


Homo sapiens 


Human ubiquitin-conjugating 


1045 


99 


1255 


AC006538 


Homo sapiens 


BC4119SJL 


831 


78 


1256 


AB004316 




mitochondrial methionyl-tRNA 
transformylase 


1556 


88 


1257 


Z35094 


Homo sapiens 


SURF-2 


1354 


97 


1258 


Y13362 


Homo sapiens 


Amino acid sequence of 
protein PRO214. 


2383 


100 


1259 


AC006014 


Homo sapiens 


similar to RFP transforming 
protein; similar to P14373 
t P TTl -rrl "aoci •» \ 


1299 


100 


1260 


AC005099 


Homo sapiens 


match to AI222572 
(NID:g3804775) 


' 4G9 


100 


1261 


V00507 


Homo sapiens 


coding sequence of DHFR (1 is 
1st base in codon) (561 is 
3rd base in codon) 


984 


100 


1262 


X15443 


XVCL L> LUb &p . 


gamma-glutamyl transpeptidase 
(AA 1-568) 


697 


32 


1263 


AF173871 


Mus musculus 


neuronal PAS 3 


977 


94 


1264 


AF178983 


Homo sapiens 


Ras-associated protein Rap1 


433 


97 


1265 


Y70473 


Homo sapiens 


Human cyclic nucleotide- 
associated protein-1 (CNAP- 


2785 


99 


1266 


Y41738 


Homo sapiens 


fiauiaxi fjcuMi protein 


1622 


100 


1267 


AF061346 


Mus musculus 


Edp1 protein 


1077 


64 


1268 


U97006 


Caenorhabditis elegans 


w-ijciu.4 gene product 


154 


23 


1269 


AF233582 


Mus musculus 


GTPase Rab37 


942 


9* 


1270 


AF19^951 


Homo sapiens 


signal recognition particle 
68 


3127 


98 


1271 


AL031177 


Homo sapiens 


□uoojnia. j vnovei protein) 


1150 


55 


1272 


AF201933 






650 


100 


1273 


AF201933 


Homo sapiens 


DC11 


346 


98 


1274 


AL021710 


Arabidopsis thaliana 


putative protein 


348 


49 


1275 


AC004449 


Homo sapiens 


R33683_3 


556 


100 


1276 


Y86295 




HL2AG87, SEQ ID NO: 210. 


1920 


100 


1277 


Y71111 


Homo sapiens 


Human hydrolase protein-9 
(HYDRL-9). 


1576 


99 


1278 


SS4421 


Homo sapiens 


T cell receptor eta-exon 


478 


100 


1279 


Y66695 


Homo 
sapiens 


PRO1344. 


1909 


100 


1280 


AF161380 


Homo sapiens 


HSPC262 


772 


100 


1281 


Y48610 


Homo sapiens 


Human breast tumour- 
associated protein 71. 


779 


100 


1282 


AC015446 


Arabidopsis thaliana 




406 


35 


1283 


AK024432 






403 


35 


1284 


^9t»l53 


Homo sapiens 


Human FADD-interacting 


1825 


81 


1285 


^1001019 


Homo sapiens 


ring finger protein 


1301 


100 


1286 


AE003823 


Drosophila 
melanogaster 


CG13178 gene product 


195 


29 


1287 


"SFT78632 


Homo sapiens 


FEM-1-like death receptor 
binding protein 


3261 


100 


1288 


AC006033 


Homo 
sapiens 


similar to MLN 64; similar to 
I38027 (PID:g2135214) 


1195 


100 


1289 


AC006033 


Homo 
sapiens 


similar to MLN 64; similar to 
I38027 (PID:g2135214) 


668 


93 


1290 


AB023811 


Homo sapiens 


TU3A 


351 


54 


1291 


Z73424 


Caenorhabditis elegans 


C44B9.1 


235 


36 


1292 


Y94871 


Homo sapiens 


Human protein clone HP02551. 


1222 


100 


1293 


AF190425 


Homo sapiens 


retinoblastoma-associated 
protein RAP140 


489 


29 


1294 
1295 


G03856 
AF133670 


Homo sapiens 
Mus musculus 


Human secreted protein, SEQ 
ID NO: 7937. 

ARL-6 interacting protein-2 


538 
367 


99 
51 


1296 
1297 


AJ249735 
X57560 


Homo sapiens 

Escherichia 

coli 


claudin-6 
pspE protein 


il42 

53g 


100 


1298 


AF169284 


Homo sapiens 


LIM and cysteine-rich domains 
protein 1 


1997 


100 


1299 


U41023 


Caenorhabdit 
is elegans 


coded for by C. elegans cDNA 
yk61fl.3; coded for by C. 
yki09h8.5 


324 


73 


1300 


AB024523 


Homo sapiens 


basic kruppel like factor 


1206 


100 


1301 
1302 


YCrnnn 

/if uu / lox 


Homo sapiens 
Homo sapiens 


eosinophil cationic-related 
protein 

unknown 


737 


99 


1303 


X52904 


Escherichia 
coli 


open reading frame (AA 1-^5) 


1481 
359 


100 

100 


1304 
1305 


U19577 
AF266508 


Escherichia 
coli 

Mus musculus 


galactonate dehydratase 
NELF protein 


242 








Homo sapiens 


Human transmembrane protein 
HTMPN-25. 


1409 
$32 


97 
100 




Uoo /DO 


Caenorhabditis elegans 


similar to the mitochondrial 
carrier family 


365 


54 


1308 


AF044774 


Homo sapiens 


breakpoint cluster region 
protein 2 


2681 


99 


1309 
1310 


AL078593 


Homo sapiens 
Homo sapiens 


dJ210B1.1 (KIAA0650) 

E48 antigen 


267 
6*20 


34 


1311 


Z82263 


Caenorhabditis elegans 


C47A4.1 


263 


96 
35 


1312 


AF131218 


Homo sapiens 


chromosome 16 open reading 
frame 5 


1493 


100 


1313 


Y41763 


Homo 
sapiens 


Human PRO938 protein 
sequence. 


1^36 


100 


1314 


AF196972 


Homo sapiens 


JM24 protein 


2239 


100 


1315 


AF053356 


Homo sapiens 


insulin receptor substrate 
like protein 


228 


Q "7 '1 
I 


1316 




Homo 
sapiens 


Membrane-bound protein 
PRO1344. 


1969 


100 


1317 


AF153127 


Gallus 
gallus 


SAPK interacting protein 


2442 


89 


1318 


AF153127 


Gallus 
gallus 


SAPK interacting protein 


1477 


83 


1319 
1320 


AF153127 
X56932 


Gallus 
gallus 

Homo sapiens 


SAPK interacting protein 
23 kD highly basic protein 


1651 


86 




AF174605 


Homo 
sapiens] 
>Y83086 
Y83086 09- 
MAR-2000 28- 
AUG-1998 F- 
box protein 
FBP-18. 
[Homo 
sapiens 


F-box protein Fbx25 


1044 
467 


100 
70 


1322 


M*1732 


Trypanosoma 
cruzi 


neuraminidase 


214 


24 


1323 


Y17013 


porcine 
endogenous 
retrovirus 


pol 


304 


64 


1324 


AL138655 


Arabidopsis 
thaliana 


putative protein 


1174 


37 


1325 


AL138655 


Arabidopsis 
thaliana 


putative protein 


946 


35 


1326 


AL133215 


Homo sapiens 


DA108L7.2 (novel protein 
similar to rat tricarboxylate 
carrier) 


1322 


99 


1327 


AF161541 


Homo sapiens 


HSPC056 


1357 


99 


1328 


Y73346 


Homo sapiens 


HTRM clone 619699 protein 
sequence. 


785 


96 


1329 


L10910 


Homo sapiens 


splicing factor 


912 


82 


1330 


AF146568 


Homo sapiens 


MIL1 protein 


1936 


100 


1331 
1332 


W87772 


Homo sapiens 


Human serum glucocorticoid- 
regulated kinase (H-SGK2) 
polypeptide. 


232 


39 


Y41741 


Homo 
sapiens 


Human PRO704 protein 
sequence. 


1860 


100 


1333 


AF295096 


Homo sapiens 


zinc-finger protein ZBRK1 


411 


91 


1334 


Z82271 


Caenorhabditis elegans 


Similarity to Mouse kinesin- 
like protein KIF4 comes from 
this gene 


578 


44 


1335 


AE000810 


Methanobacterium 
thermoautotrophicum 


conserved protein 


290 


43 


1336 


Y68779 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-11. 


1019 


91 


1337 


AB027003 


Mus musculus 


protein phosphatase 


378 


84 


1338 


U64856 


Caenorhabditis elegans 


weak similarity to TPR 
domains 


215 


40 


1339 


AE001394 


Plasmodium 
falciparum 


protein of the YMR7 family 


170 


29 


1340 


X76717 


Homo sapiens 


MT-11 protein 


204 


89 


1341 


AC011914 


Arabidopsis 
thaliana 


putative mutT protein; 68398- 
67881 


289 


45 


1342 


^27*171 


Homo sapiens 


ASPIC 


2122 


100 


1343 


AF187016 


Homo sapiens 


myosin regulatory light chain 
interacting protein MIR 


2303 


99 


1344 


AC006963 


Homo sapiens 


similar to Kelch proteins; 
similar to BAA77027 
(PID:g4650844) 


894 


' IS 


1345 


AF2*74« 


Homo sapiens 


N-acetylneuraminic acid 
phosphate synthase 


1880 


99 


1346 


Y25896 


Homo sapiens 


Human secreted protein 
fragment encoded from gene 
64. 


1148 


100 


1347 


A*272073 


Torpedo 
marmorata 


male sterility protein 2-like 
protein 


1664 


58 


1348 


AF161548 


Homo sapiens 


HSPC063 


1018 


98 


1349 


W78128 


Homo sapiens 


Human secreted protein 
encoded by gene 3 clone 
H0SBI96. 


1117 


100 


1351 


G02144 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6225. 


418 


100 


1352 


D90869 


Escherichia 
coli 


similar to 


2047 


100 


1353 


A12029 


Homo sapiens 


MRP-14 


413 


100 


1354 


AC005328 


Homo sapiens 


R26660_1, partial CDS 


870 


74 


1355 


AC024876 


Caenorhabditis elegans 


contains similarity to 
SW:RPB1_CRIGR 


829 


61 


1356 


AF077226 


Homo sapiens 


copine III 


1876 


64 


1359 


AF217188 


Mus musculus 


YIP1B 


801 


63 


1360 


AC074331 


Homo sapiens 


ZNF234 


3869 


100 


1361 


AL163279 


Homo sapiens 


homolog to cAMP response 
element binding and beta 
transducin family proteins 


503* 


99 


1362 


Z48475 


Homo sapiens 


glucokinase regulator 


3160 


99 


1363 


Z48475 


Homo sapiens 


glucokinase regulator 


2682 


97 


1364 


AF195^64 


Homo sapiens 


megakaryocyte-enhanced gene 
transcript 1 protein; MEGT1 
protein 


20SS 


99 


1365 


AF116609 


Homo sapiens 


PR00915 


581 


100 


1366 


AF116609 


Homo sapiens 


PR00915 


581 


100 


1367 


AL117352 


Homo sapiens 


dJ876B10.3 (novel protein 
similar to C. elegans 
T19B10.6 (Tr:Q22557)) 


2581 


„- 


1368 


Y34124 


Homo 
sapiens 


Human potassium channel 
K+Hnov15. 


1342 


100 


1369 


AJ245621 


Homo sapiens 


CTL2 protein 


3728 


99 


1370 


AF008220 


Bacillus 
subtilis 


YtaG 


429 


45 


1371 


X05562 


Homo sapiens 


alpha-2 chain precursor (aa 
-25 to 1018) (3416 is 2nd base 
in codon) 


5908 


99 


1372 


Z98048 


Homo sapiens 


dJ408N23.4 (novel DnaJ domain 
protein) 


1296 


99 


1373 


AF154415 


Homo sapiens 


FLASH 


10253 


100 


1374 


U20286 


Rattus 
norvegicus 


lamina associated polypeptide 
1C 


1567 


69 


1375 


U53445 


Homo sapiens 


DOC1 


1645 


46 


1376 


AL117337 


Homo 
sapiens 


OA393J16.1 (zinc finger 
protein 33a (KOX 31)) 


250 


60 


1377 


AC005328 


Homo sapiens 


R26660_1, partial CDS 


1126 


100 


1378 


U35113 


Homo sapiens 


metastasis-associated gene 


1823 


69 


1379 


L15313 


Caenorhabditis elegans 


putative 


858 


58 


1380 


Y25756 


Homo sapiens 


Human secreted protein 
encoded from gene 46. 


1508 


100 


1381 


AB037360 


Homo sapiens 


ANKHZN 


5734 


95 


1382 


AB037360 


Homo sapiens 


ANKHZN 


959 


97 


1383 


AF237676 


Mus musculus 


G beta-like protein GBL 


1721 


96 


1384 


AF237676 


Mus musculus 


G beta-like protein GBL 


1043 


70 


1385 


Y58793 


Homo sapiens 


Human calcium regulatory 
protein CaREG-1. 


715 


100 


1386 


AF212162 


Homo sapiens 


ninein 


10369 


99 


1387 


AL031685 


Homo sapiens 


dJ963K23.2 (novel protein) 


337 


33 


1388 


AC004890 


Homo sapiens 


similar to zinc finger 
proteins; similar to BAA24380 
>W06316 W06316 03-OCT-1996 
27-APR-1995 TRP-1 protein. 


542 


86 


1389 


AF187989 


Homo sapiens 


zinc finger protein ZNF223 


2665 


99 


1390 


AC035150 


Homo sapiens 


Zinc finger protein ZNF221 


3459 


100 


1391 


AF287894 


Homo sapiens 


PIST 


1410 


97 


1392 


AF282265 


Homo sapiens 


inner centromere protein 
INCENP 


1794 


99 


1393 


X90840 


Homo sapiens 


axonal transporter of 
synaptic vesicles 


4584 


99 


1394 


AF076249 


Homo sapiens 


zinc finger protein SBBIZ1 


3206 


99 


1395 


G02224 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6305. 


299 


75 


1396 


AC004809 


Arabidopsis 
thaliana 


Similar to 


130 


34 


1398 


AF242519 


Homo sapiens 


zinc finger protein SBZF3 


181 


66 


1399 


AL133396 


Homo 
sapiens 


dJ1068H6.4 (prion protein 
like protein doppel) 


962 


100 


1400 




Homo sapiens 


Human breast tumour- 
associated protein 72. 


817 


99 


1401 


AC004472 


Homo sapiens 


P1.11659_5 


280 


54 


1402 


X91489 


Saccharomyces cerevisiae 


putative HMG box 




21 


1403 




Homo 
sapiens 


Human transferase TRNSFS-14. 


2842 


100 


1404 


X81058 


Mus musculus 


tex261 


1010 


99 


1405 


-H-tl u J. ^ u D *i 


Mus musculus 


ITM 


194 


29 


1406 


AB030251 


Homo sapiens 


GTPase activating protein 


3233 


99 




AJ010585 


Rattus 
rattus 


PTB-like protein 


2684 


99 


1408 


X75760 


Drosophila 
melanogaster 


LRR47 


364 


29 




U76618 


Mus musculus 


N-RAP 


804 


48 


1410 


AC005578 


Homo sapiens 


F20887_l, partial CDS 


835 


63 


1411 


AE000284 


Escherichia 
coli 


orf, hypothetical protein. 


360 


100 


1412 


X01563 


Escherichia 
coli 


L5 (rplE) (aa 1-179) 


911 


100 


1413 


W78279 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 33. 


1264 


99 


1414 


AB031051 


Homo sapiens 


organic anion transporter 
OATP-E 


3832 


100 


1415 


M17466 


Homo sapiens 


coagulation factor XII 


3455 


100 


1416 


AF097994 


Homo 
sapiens 


L-kynurenine/alpha- 
aminoadipate aminotransferase 


2202 


99 


1417 


AF151077 


Homo sapiens 


HSPC243 


1262 


99 


1418 


Y09945 


Rattus 
norvegicus 


putative integral membrane 
transport protein 


1098 


61 


1419 


U13152 


Mesocricetus 
auratus 


guanine nucleotide-binding 
protein beta 5 


2179 


76 


1420 


AL162458 


Homo sapiens 


bA465L10.5 (KIAA1176 (novel 
protein, presumed ortholog 
of mouse K-Cl cotransporter 
KCC2)) 


5696 


100 


1421 


Y99426 


Homo sapiens 


Human PRO1604 (UNQ785) amino 
acid sequence SEQ ID NO: 308. 


152 


29 


1422 


Y94923 


Homo sapiens 


Human secreted protein clone 
qs14_3 protein sequence SEQ 
ID NO: 52. 


4039 


99 


1423 


AF177388 


Homo 
sapiens 


cancer-amplified 
transcriptional coactivator 
ASC-2 


10748 


5*9 " 


1424 


Y48517 


Homo sapiens 


Human breast tumour- 
associated protein 62. 


1851 


99 


1425 


AF208848 


Homo sapiens 


BM-006 


1454 


89 


1426 


AF208848 


Homo sapiens 


BM-006 


853 


79 


1427 


AF112886 


Bos taurus 


differentiation enhancing 
factor 1 


4693 


95 


1428 


U41387 


Homo sapiens 


Gu protein 


1372 


63 


1429 


AF161534 


Homo sapiens 


HSPC049 


2853 


7^8 


1430 


AF125043 


Mus musculus 


bisphosphate 3'-nucleotidase 


275 


30 


1431 


Y66718 


Homo sapiens 


Membrane-bound protein 
PRO1105. 


1686 


100 


1432 


AF193613 


Homo sapiens 


cell recognition molecule 
Caspr2 


568 


100 


1433 


AB044560 


Mus musculus 


Gliacolin 


192 


34 


1434 


R99900 


Homo sapiens 


NTII-1 nerve protein, 
facilitates regeneration of 
nerve cells. 


707 


51 


1435 


AF220530 




myo-inositol 1-phosphate 
synthase A1 


2904 


100 


1436 


X70944 


Homo sapiens 


PTB-associated splicing 
factor 


1261 


72 


1437 


AF271732 


Homo sapiens 


Bridging integrator-3 


1282 


100 


1438 


Y30811 


Homo sapiens 


Human secreted protein 
encoded from gene 1. 


*95 


9tf" '— " 


1439 


AJ293659 


Homo sapiens 


mucolipidin 


628 


97 


1440 


AF219138 


Homo sapiens 


GGA3 long isoform 


3083 


100 


1441 


AF219138 


Homo sapiens 


GGA3 long isoform 


3346 


100 


1442 


AB039669 


Homo sapiens 


ALKX3 


1944 


100 


1443 


AF237711 


Drosophila 
melanogaster 


Diablo 


191 


27 


1444 


AJ011096 


Homo sapiens 


Naf 1 beta protein 


439 


39 


1445 


X73874 


Homo sapiens 


phosphorylase kinase 


6233 


" 9d 


1446 


AF214114 


Homo sapiens 


breast carcinoma-associated 
antigen BCAA 


3999 


99 


1447 


AF003924 


Homo sapiens 


ANC 2H01 


2645 


99 


1448 


AF003136 


Caenorhabditis elegans 


contains weak similarity to 
an AMP-binding motif 


2843 


52 


1449 


AF155112 


Homo sapiens 


NY-REN-50 antigen 


1184 


69 


1450 


Y95004 


Homo sapiens 


Human secreted protein 
vc54_l, SEQ ID NO: 48. 


985 


100 


1451 


AF107203 


Homo sapiens 


ataxin 2-binding protein 


68B 


57 


1452 


AF107203 


Homo sapiens 


ataxin 2-binding protein 


4*6 


78 


1453 
1454 


Z38011 


Mus musculus 


DMR-N9 


882 


56 


X90568 


Homo sapiens 


Protein sequence and 
annotation available soon via 
LABEIT@EMBL-Heidelberg.DE 


510 


28 


1455 


AL035409 


Homo sapiens 


dJ564M11.3 (similar to 
sialyltransferase) 


1356 


100 


1456 


D44480 


Mus musculus 


MATH-2 protein 


272 


100 


1458 


AF141326 


Homo sapiens 


RNA helicase HDB/DICE1 


478 


45 


1459 


AF242552 


Gallus 
gallus 


retinovin 


945 


34 


1460 


U11036 


Homo sapiens 


Ibdl 


724 


84 


1461 


AB02S2*8 


Mus musculus 


granuphilin-a 


545 


39 


1462 


Y08134 


Homo sapiens 


acid sphingomyelinase-like 
phosphodiesterase 


2428 


99 


1463 


AC004997 


Homo sapiens 


match to ESTs 243979 
(NID:g573097), R19699 
(NID:g774333) 


869 


98 


1464 


AC004997 


Homo sapiens 


match to ESTs 243979 
(NID:g573097), R19699 
(NID:g774333) 


869 


98 


1465 


U32743 


Haemophilus 
influenzae Rd 


fucose operon protein (fucU) 


315 


50 


1466 


Y09022 


Homo sapiens 


Not56-like protein 


2342 


100 


1467 


AC003034 


Homo sapiens 


Homolog of rat kidney- 
specific (KS) gene 


1072 


99 


1468 


AF071544 


Spinacia 
oleracea 


ribulose-1,5-bisphosphate 
carboxylase/oxygenase small 
subunit N-methyltransferase I 


333 


"2<5 


1469 


Y57930 


Homo sapiens 


Human transmembrane protein 
HTMPN-54. 


ICS'S 


100 


1470 


AF032666 


Rattus 
norvegicus 


rsec5 


4504 


93 


1471 


Y70467 


Homo sapiens 


Human membrane channel 
protein-17 (MECHP-17). 


4*2 


74 


1472 


AL031033 


Homo sapiens 


C321D2.1 (Ribosomal Large 
Subunit Pseudouridine 
Synthase protein) 


1694 


100 


1473 


AF177292 


Homo sapiens 


genethonin 3 


4026 


98 


1474 


S45936 


Homo sapiens 


Htsi 


1101 


50 


1475 


¥8*241 


Homo sapiens 


Human secreted protein 
HOABR60, SEQ ID NO: 156. 


1879 


98 


1476 


AJ010317 


Fugu 
rubripes 


Sand 


1278 


68 


1477 


U42831 


Caenorhabditis elegans 


coded for by C. elegans cDNA 
yk99b4.3; similar to human 
transforming protein 
(PIR:S22157) 


846 


44 


1478 


X62447 


Homo sapiens 


PR 264 


543 


61 


1479 


X82209 


Homo sapiens 


MN1 


7116 


100 


1480 


U10536 


Pan paniscus 


MHC class I A 


67S 


84 


1481 


AL078599 


Homo sapiens 


dJ991C6.1 (novel protein 
similar to C. elegans 
F55A12.9 (Tr:P91086)) 


1274 


65 


1482 


Z98977 


Schizosaccharomyces 
pombe 


putative vacuolar protein 


256 


29 


1483 


AB005662 


Mus musculus 


JNK/SAPK-associated protein-1 


4968 


92 


1484 


AL050120 


Homo sapiens 


hypothetical protein 


716 


100 


1485 


M27878 


Homo sapiens 


DNA binding protein 


1006 


53 


1486 


Y69161 


Homo sapiens 


Amino acid sequence of a 
partial protein kinase. 


575 


99 


1487 


- X8415g 


Saccharomyces cerevisiae 


Aim 


341 


29 


1488 




Homo sapiens 


RNA helicase 


446 


34 


1489 


U56966 


Caenorhabdit 
is elegans 


coded for by C. elegans cDNA 
yk30b3.5; coded for by C. 
elegans cDNA yk30b3.3 


620 


42 


1490 


AE000989 


Archaeoglobus fulgidus 


enoyl-CoA hydratase (fad-4) 


533 


46 


1491 




Rattus 
norvegicus 


adenylyl cyclase type IV 


707 


95 


1492 


Y73342 


Homo sapiens 


HTRM clone 2709055 protein 
sequence. 


3513 


99 


1493 




Homo sapiens 



Human secreted protein (clone 
fj283-11). 


462 


37 


1494 




Mus musculus 


ARL-6 interacting protein-2 


701 


97 


1495 


Y94897 


Homo 
sapiens 


Human protein clone HP10574. 


1371 


100 


1496 


.M-UUf* JOSS 


Homo sapiens 


dJ747H23.2 (novel protein) 


1550 


100 


1497 




Homo sapiens 


ribosomal S6 protein kinase 


2427 


100 


1498 


AL445067 


Thermoplasma 
acidophilum 


putative target YPL207w of 
the HAP2 transcriptional 
complex related protein 


269 


35 


1499 


AB039947 


Homo sapiens 


X11L-binding protein 51 


227 


36 


1500 




Homo sapiens 


UBASH3A protein 


3509 


100 


1501 


nuwj w J _> J 


Homo 
sapiens 


dJ93K22.1 (novel protein 
(contains DKFZP564B116)) 


2439 


100 


1502 




Homo sapiens 


TALE homeobox protein Meis2b 


1140 


100 


1503 


AF178948 


Homo sapiens 


TALE homeobox protein Meis2a 


1177 


100 


1504 




Homo sapiens 


Human secreted protein clone 
pn749_8 protein sequence SEQ 
ID NO: 16. 


1442 


99 


1505 


X82494 


Homo sapiens 


fibulin-2 


5580 


99 


1506 




Homo sapiens 


ubiquitin hydrolase 


783 


42 


1507 


AL034548 


Homo sapiens 


dJ1103G7.6 (novel protein) 


1098 


100 


1508 


Y76144 


Homo sapiens 


Human secreted protein 
encoded by gene 21. 


1736 


100 


1509 


AF220182 




uncharacterized hypothalamus 
protein HT008 


1181 


98 


1510 


"tte4$01 


Caenorhabditis elegans 


Gene probably begins in the 
next cosmid 


415 


58 


1511 


AL356192 


Neurospora crassa 


related to MDM1 protein 


196 


29 


1512 
1513 


D17629 


Homo sapiens 
Homo sapiens 


N-acetylgalactosamine 6- 
sulfate sulfatase (GALNS) 
x 009 protein 


1829 


100 


1514 


AJ243531 


Homo sapiens 


nMl5 protein 


694 
735 


99 
100 


1515 


AC003672 


Arabidopsis 
thaliana 


putative C3HC4-type RING zinc 
finger protein 


407 


30 


1516 
1517 


AF115435 


Rattus 
norvegicus 


syntaxin 17 


1374 


90 




AF003140 


Caenorhabditis elegans 


C44E4.5 gene product 


274 


31 


1518 


AB002584 


Rattus 
norvegicus 


beta-alanine-pyruvate 
aminotransferase 


2238 


82 


1519 


AL121764 


Schizosaccharomyces 


yeast atp12 protein precursor 
homolog 


270 


30 


1520 


AF255910 




vascular endothelial 
junction-associated molecule 


547 


100 


1521 


D31764 


Homo sapiens 


KIAA0064 


170 


27 


1522 


Y66634 


Homo sapiens 


Membrane-bound protein 
PRO190. 


985 


100 


1523 


Y94450 


Homo sapiens 


Human inflammation associated 
protein 


250 


43 


1524 


AC000107 


Arabidopsis thaliana 




277 


37 


1525 


AF109377 


Mus musculus 


IdlBp 


1277 


83 


1526 


AL031427 


Homo sapiens 


dJ167A19.4 (novel protein) 


1432 


99 


1527 


Y08135 


Mus musculus 


acid sphingomyelinase- like 
phosphodiesterase 


1496 


79 


1528 




Homo sapiens 


FLJ00012 protein 


611 


100 


1529 


AF154502 


Homo sapiens 


quiescent cell proline 
dipeptidase 


679 


100 


1530 




Homo sapiens 


transposase-like protein 


1368 


100 


1531 




Homo sapiens 


putative zinc finger protein 


1420 


50 


1532 


W74805 


Homo sapiens 


Human secreted protein 
encoded by gene 77 clone 
HOEAS24. 


493 


57 




Ar 03902 J 


Homo sapiens 


Ran-GTP binding protein; 
RanBP6 


5707 


99 






Arabidopsis 
thaliana 


F23N19.9 


374 


Ti 


1535 


AB027564 


Homo sapiens 


DINB1 


44B2 


100 


1536 


Y36178 


Homo sapiens 


Human secreted protein 


377 


87 


1537 


Y50907 


Homo sapiens 


Human fetal brain cDNA clone 
vb3_1 derived protein. 


3693 


99 


1538 


AF017368 


Mus musculus 


faciogenital dysplasia 
protein 2 


177 


47 




no^e ice 


Homo sapiens 


sphingosine kinase 


2011 


99 


1540 




Homo sapiens 


OA1 


2238 


100 


1541 




Caenorhabditis elegans 


Contains similarity to Pfam 
domain: PF00169 (PH), 
Score=20.6, E-value=1.9e-05, 


379 


42 


1542 


Y71159 


Homo sapiens 


Human phosphodiesterase 
interacting protein, 
myomegalin. 


9415 


99 


1543 


X76092 


Homo sapiens 


DNA binding protein RFX3 


3327 


100 


1544 


AB015330 


Urtrurt rati i on a 




631 


50 


1545 


AF198487 


Homo sapiens 


transcription factor LBP-1b 


2822 


100 


1546 


AF016417 


Caenorhabditis elegans 


Similar to BZIP transcription 
factor 


518 


42 


1547 


X55885 


Homo sapiens 


KDEL receptor 


not; 


100 


1548 




va iol S 5 luy 


ubiquitin-activating enzyme 

171 


836 


42 


1549 


AL021707 


Homo sapiens 


dJ508I15.4 (KIAA0668) 


3688 


100 


1550 




Bacillus 
subtilis 


YvqK protein 


292 


42 


1551 


AF14^K1 5 


Drosophila 
melanogaster 


BcDNA.GH03377 


822 


44 




ALjXo / 


Schizosaccharomyces 
pombe 


putative mannosyltransferase 
involved in N-glycosylation 


435 


37 


1553 


AF079S27 


Mus musculus 


XERii 


691 


63 


1554 


AB026291 


Rattus 
norvegicus 


acetoacetyl-CoA synthetase 


1099 


68 


1555 


Y44722 


Homo sapiens 


Human immune system molecule, 
ISMO-3. 


1780 


99 


1556 


AF116553 


Drosophila 
melanogaster 


antennal-specific short-chain 
dehydrogenase/reductase 


277 


32 


1557 


Y71056 


Homo sapiens 


Human membrane transport 
protein, MTRP-1. 


1975 


99 


1558 


Y71056 


Homo sapiens 


Human membrane transport 
protein, MTRP-1. 


1975 


99 


1559 


Y71056 


Homo sapiens 


Human membrane transport 
protein, MTRP-1. 


1894 


97 


1560 


AF092050 


Mus musculus


beta-1,3-N-
acetylglucosaminyltransferase


262 


44 


1561 


AL109827 


Homo sapiens 


dJ309K20.2 (acrosomal protein
ACR55 (similar to rat sperm
antigen 4 (SPAG4)))


1607 


97 


1562 


AJ131890 


Homo sapiens 


DNA polymerase lambda 


3002 


100 


1563 


AL035424 




dA22D12.1 (novel protein 
similar to Drosophila Kelch 
proteins) 


3015


100 


1564 


AC002400 


Homo sapiens 


Gene product with similarity 
to Ubiquitin binding enzyme 


2790 


100 


1565 


AC005306 


Homo sapiens 


R27216_l 


919 


82 


1566 


AF000195 


Caenorhabdit 
is elegans 


Contains similarity to Pfam 
domain: PF00169 (PH),
Score=20.6, E-value=1.9e-05,
N=1


550 


45 


1567


AB033281 


Homo 
sapiens 


F-box and WD-repeats protein
beta-TRCP2 isoform C 


2879 


100 


1568 


D49473 


Mus musculus


truncated form of Sox17


1047 


78 


1569 


AK025270


Homo sapiens 


unnamed protein product 


210 


91 


1570 


X75756 


Homo sapiens 


protein kinase C mu 


4797 


99 


1571 


AF145713 


Homo sapiens 


SCHIP-1


2388 


100 


1572 


AE003831 


Drosophila 
melanogaster


CG18445 gene product 


180 


31 


1573 


AF074603 


Streptomyces 
griseus 
subsp.
griseus 


NonF 


205 


38 


1574 


U28993 


Caenorhabdit 
is elegans 


F22D3.3 gene product 


144 


27 


1575 


AF129507 


Homo sapiens 


transcription factor ICBP90 


287 


68 


1576 


X64878 


Homo sapiens 


oxytocin receptor 


2002 


100


1577 


AF237711 


Drosophila 
melanogaster 


biaJblo 


421 


54


1578


G00975 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 5056. 


480 


100 


1579 


AF243744 


Cryptosporid 
ium parvum 


thrombospondin-related
adhesive protein 


123 


33 


1580 


AL121782 


Homo sapiens 


dJ585H4.2 (novel protein 
(translation of cDNA
Em:AK000219))


£63 


100 


1581 


AF041853


Homo sapiens 


kinesin family member protein
KIF3A 


345 


33 


1582 


AF025441 


Homo sapiens 


Opa-interacting protein OIP5


1198 


100


1583 


AE001803 


Thermotoga
maritima 


glycerate kinase, putative 


349 


34 


1584 


AF252283 


Homo sapiens 


Kelch-like 1 protein 


3973 


100 


1585 


AF169675 


Homo 
sapiens 


leucine-rich repeat
transmembrane protein FLRT1 


3494 


99 


1586 


AF116274 


Homo sapiens 


DNb-5 


2628 


97 


1587 


X79440 


Homo sapiens 


NADP+-dependent malic enzyme


3167 


99 


1588 


X99802 


Homo sapiens 


ZYG homologue 


3966 


99 


1589 


AF169803 


Homo sapiens 


flavohemoprotein b5+b5R


2563 


100 


1590 


Y29861 


Homo sapiens 


Human secreted protein clone 
cb98_4.


181 


47 


1591 


Z25535 


Homo sapiens 


nuclear pore complex protein 
hnup153


7567 


99 


1592 


X13293 


Homo sapiens 


B-myb protein (AA 1-700) 


3678 


99 


1593 


M74027 


Homo sapiens 


mucin 


242 


27 


1594 


ALl}9314 


Schizosaccha 
romyces 


hypothetical protein 


235 


54 
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SEQ 
ID 

NO: 


ACCESSION 
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SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 






pombe 








1595 


W78324 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 81. 


1318 


98 


1596 


Y94906 


Homo sapiens 


Human secreted protein clone 
rb649_3 protein sequence SEQ 
ID NO: 18. 


2236 


98 


1597 



AF174605 


Homo sapiens


F-box protein Fbx25 


1408 


99 


1598


AB032254 


Homo 
sapiens 


bromodomain adjacent to zinc 
finger domain 2A 


9676 


98 


1599 


X73114 


Homo sapiens


slow MyBP-C 


5568 


95 


1600 


X82200 


Homo sapiens


gpStaf50


2305 


100 


1601 


Y00876 


Homo
sapiens 


Human LAPH-1 protein 
sequence.


1149 


98 


1602 


AJ223351 


Homo sapiens 


HIRA-interacting protein 3


2821 


99 


1603 


AJ222801 


Homo sapiens


neutral sphingomyelinase 


2268 


99 


1604 


AJ222801 


Homo sapiens


neutral sphingomyelinase 


1601 


99 


1605


AF185576 


Mus musculus


POZ/zinc finger transcription 
factor ODA-8 


3435 


97 


1606 


AF093744 


Homo sapiens 


unknown 


131 


100 


1607 


A12142 


synthetic 
construct 


IFN-pseudo-omega 2


800


98 


1608 


Y57949


Homo sapiens 


Human transmembrane protein 
HTMPN-73. 


1868 


100 


1609 


AF151044 


Homo sapiens 


HSPC210 


681 


97 


1610 


X15218 


Homo sapiens 


ski protein (AA 1-728)


37*S 


100 


1611 


Y08200


Homo sapiens 


rab geranylgeranyl 
transferase 


2976 


100 


1612 


AF220560 


Homo sapiens 


B/K protein 


2486 


99 


1613 


AC004481 


Arabidopsis
thaliana 


nodulin-like protein 


371 


26 


1614 


Y09501 


Homo sapiens 


NADH-cytochrome-b5 reductase 


1607 


100 


1615 


Y15521 


Homo sapiens 


start position 1 


3150 


97 


1616 


AJ010750 


Rattus 
norvegicus 


Castration induced prostatic 
apoptosis related protein-1,
(CIPAR-1)


890 


62 


1617 


X58079 


Homo sapiens 


S100 alpha protein 


481 


100 


1618 




Homo 
sapiens 


Membrane-bound protein
PRO1009. 


967 


100 


1619 


AJ242973 


Homo sapiens 


peptide methionine sulfoxide 
reductase 


929 


100 


1620 


AF150733 


Homo sapiens 


AD-014 protein


288 


100 


1621 


AJ007509 


Homo sapiens


E1B-55kDa-associated protein


4646 


98 


1622 


X64177 


Homo sapiens 


metallothionein 


380 


100 


1623 


AE001045 


Archaeoglobu 
s fulgidus 


A. fulgidus predicted coding 
region AF0859 


240 


36 


1624 


AL355013


Schizosaccha

romyces 

pombe 


mitochondrial carrier protein 


403 


34 


1625 


Y66746 


Homo 
sapiens 


Membrane-bound protein 
PRO1198.


1184


100 


1626 


D90053 


Sus scrofa 


destrin 


863 


100 


1627 


Y35954 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
203. 


756 


100 


1628 


AL031775 


Homo sapiens 


dJ30M3.2 (novel protein)


470 


100 


1629 


AF132484 


Mus musculus 


unknown 


286 


68 


1630 


AF017096 


Drosophila
melanogaster 


similar to C. elegans 
R10H10.6 and S. cerevisiae 
YD8419.03C 


493 


61 


1631 


X03077 


Homo sapiens 


lactate dehydrogenase-A


1704 


100 


1632 


AF151084


Homo sapiens 


HSPC250 


7*3 


100 


1633 


AJ001874 


Homo sapiens 


orf


255


97 


1634 


AC012187


Arabidopsis
thaliana 


Contains weak similarity to 
GATA-6 DNA-binding protein 
gb|H36135, gb|Z26200 come 
from this gene. 


143 


38 
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WATERMAN 
SCORE 


IDENTITY 


1635 


AF026246 


Homo sapiens 


HERV-E integrase


411 


90 


1636 


Y50943 


Homo sapiens 


Human adult brain cDNA clone 
ve8_l derived protein. 


1125 


95 


1637 


AF134593 


Homo sapiens 


L-pipecolic acid oxidase 


2068 


" 99 


1638 


AJ238247 


Mus musculus 


putative phosphatase subunit 


1948 


96 


1639 


Y94942 


Homo sapiens 


Human secreted protein clone 
yk251_1 protein sequence SEQ
ID NO: 50.


1320 


100 


1640 


AF235030 


Homo sapiens 


BM88 antigen 


766 


99 


1641 


AF233288 


Drosophila 
melanogaster 


WDS 


358 


26 


1642 


M19351 


Mus musculus 


immunoglobulin heavy chain 
binding protein 


145 


34 


1643 


Y70452 


Homo sapiens 


Human membrane channel 
protein-2 (MECHP-2).


1352 


100 


1644 


AF176520 


Mus musculus 


WD repeat-containing F-box
protein FBW5 


2*7* 


88 


1645


W67816 


Homo sapiens 


Human secreted protein 
encoded by gene 10 clone 
HCEMU42. 


1156 


100 


1646 


X67155 


Homo sapiens 


mitotic kinase-like protein-1


4456 


99 


1647 


M63180 


Homo sapiens 


threonyl-tRNA synthetase 


1040 


61 


1648


Y87342 


Homo sapiens 


Human signal peptide 
containing protein HSPP-119 
SEQ ID NO: 119. 


1566 


93 


1649 


R95332 


Homo sapiens 


Tumor necrosis factor 
receptor 1 death domain 
ligand (clone 3TW).


4137 


100 


"1650 


AC007136


Homo sapiens 


Putative map kinase 
interacting kinase 


856 


99 


1651 


AB015346 


Homo sapiens 


Eps15R


4464 


99 


1652 


AL161576 


Arabidopsis 
thaliana


putative protein 


1341 


48 


1653 


AC005313 


Arabidopsis 
thaliana 


putative calmodulin 


288 


28 


1654 


AL031428 


Homo sapiens 


dJ184J9.1 (KIAA0601 protein)


3526 


100 


1655 


AL031428 


Homo sapiens 


dJ184J9.1 (KIAA0601 protein)


3526 


100 


1656 


AB017910 


Dictyosteliu 
m discoideum


mycM 


297 


32 


1657 


Y28919 


Homo 
sapiens 


Human regulatory protein 
HRGP-5. 


2251 


99 


1658 


AF056191 


Homo sapiens 


TPA inducible protein 


2744 


98 


1659 


U76846 


Arabidopsis 
thaliana 


ubiquitin-specific protease


137 


35 


1660 


AL078^27 


Schizosaccha 

romyces 

pombe 


actin-like protein; (2 actin
domains) 


320 


34 


1662 


X52022 


Homo sapiens 


collagen type VI, alpha 3 
chain 


16274 


99 


1663 


AF300648 


Homo 
sapiens 


guanine nucleotide binding 
protein beta subunit 4 


1811 


100 




AF214736 


Homo sapiens 


EH domain containing protein 
2 


2774 


100 


1665 


Z48613 


Saccharomyce 
s cerevisiae 


unknown 


138 


2tj 


1666 


AF17738* 


Homo 
sapiens 


cytochrome c oxidase assembly 
protein isoform 2 


1395 


99 


1667 


AC007842 


Homo sapiens 


BC331191_JL 


1581 


47 


1668


S67513 


Borna 
disease 
virus BDV, 
WT-1, Halle 
Bl/91, horse 
brain, field 
isolate. 
Peptide, 370 


p40 


397 


43 
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SEQ 
ID 
NO: 


ACCESSION 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


IDENTITY 






aa 









1669 


Z99753 


Schizosaccha
romyces
pombe


putative NOL1-NOP2-sun family
nucleolar protein 


569 


47 


1670 


G03130 


Homo sapiens


Human secreted protein, SEQ 
ID NO: 7211. 


427 


97 


1671 


M96625 


Gallus 
gallus 


cardiac muscle tensin 


1185 


54 


1672 




Homo sapiens 


polycomb 3 


2005


99 


1673


Y51846


" Homo sapiens 


Human 18.1 homolog protein 
fragment.


233 


29 


1674 


AF255334 


Homo sapiens 


EXP35 


" is** ' 


29 


1675


Y94867


Homo 
sapiens 


Human protein clone HP10563. 


""109 


30 


1676 


Y25712 


Homo sapiens 


Human secreted protein 
encoded from gene 2. 


3043


99 


1677 


Y25712 


Homo sapiens 


Human secreted protein 
encoded from gene 2. 


1580 


91 


1678 

1 C7Q 


AF163151 


Homo sapiens 


dentin sialophosphoprotein 
precursor 


170 


17


xo / j 


AF163151


Homo sapiens 


dentin sialophosphoprotein 
precursor 


170 


17 


1680 


AK024453 


Homo sapiens 


FLJ00045 protein


1349 


100 


1681


AF019236 


Dictyosteliu
m discoideum


TipD 


613 


34 






Leishmania 
major 


proteophosphoglycan 


""153 


26 


1683


Z69369 


Schizosaccha

romyces 

pombe 


putative GTP-binding protein


5*0 


46 


1684 




Homo sapiens 


ERp28 


1334 


100 


1685 




Takifugu 
rubripes 


retinitis pigmentosa GTPase 
regulator-like protein


196 


19 


1686 


AF191298 


Homo sapiens 


vacuolar sorting protein 35 


4087 


100 


1687 




Homo sapiens 


transcription factor 


2956 


100 


1688 




Homo sapiens 


transcription factor 


1886 


88


1689 


X07311 


Drosophila 
melanogaster 


heat shock protein


138 


43 


1690 




Rattus 
norvegicus 


LIS1-interacting protein
NUDE1 


1383 


83 


1691 


AiT27207fl 


Homo sapiens


APOBEC-1 stimulating protein


1256 


68 


1692 


AJ272079 


Homo sapiens 


APOBEC-1 stimulating protein 


1336 


60 


1693 


rve ±. I t y±£ 


Xenopus 
laevis 


katanin p60 


1664 


66 


1694 


AF263539 


Homo sapiens 


arginine N-methyltransferase


1774 


100 


1695 


AF , 222£rq 


Homo
sapiens


protein arginine N-
methyltransferase 1-variant 2


1182 


81 


1696


AK000193 


Homo sapiens


unnamed protein product 


1060 


100 


1697 


AB041035 


Homo sapiens


kidney superoxide-producing
NADPH oxidase


3122 


100 


1698


AB041035


Homo sapiens 


kidney superoxide-producing
NADPH oxidase


2181 


100 


1699 


AF025772 


Homo sapiens


C2H2 zinc finger protein


488


54 


1700 


Y44676 


Homo sapiens 


Human arf-Related Protein-1
(HARP-1).


938 


97 


1701 


AK022407 


Homo sapiens 


unnamed protein product 


315 


98 


1702 


AB024574 


Homo sapiens 


GTP-binding like protein 2 


1172 


100 


1703 


AF055078 


Homo sapiens


zinc finger protein 42 


421 


52 


1704 


AF198092 


Mus musculus


RP42 


1057 


77


1705 


AE003573 


Drosophila 
melanogaster 


CG12474 gene product


161 


33 


1706 


AB036345 


Drosophila 
melanogaster 


aquaporin


164 


24 


1707 


Y55927 


Homo sapiens 


Human STLK2 protein. 


2146 


100 


1708 


U27121 


Danio rerio 


G12 


212 


47


1709


At391710 " 


Arabidopsis 


putative protein 


505 


50 
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TABLE 2 



SEQ 
ID 

NO: 

1710 


ACCESSION 
NUMBER 

B01311 


SPECIES 

thaliana 
Homo sapiens 


DESCRIPTION
Human PRO241 polypeptide.


SMITH- 
WATERMAN 
SCORE 


%

IDENTITY 


1711 
1712 


U40750 
AJ011118 


Mus musculus
Mus musculus


formin binding protein 30 
skeletal muscle and cardiac 
protein 


1649 
4561 
1490 


97 
" 89 


1713 


AF255303 


Homo 
sapiens 


membrane-associated nucleic 
acid binding protein 


4416 


99 


1714 


"AP2S5303 


Homo 
sapiens 


membrane-associated nucleic 
acid binding protein 


2960 


100 


1715 


U6sii7 ■ 


Rattus 
norvegicus 


Ras-related protein


511 


51 


1716


AF168795 


Rattus 
norvegicus 


schlafen-4 


1129 


44 


1717 


AF196304 


Homo sapiens 


SUMO-1-specific protease


S804 


99 


1718 


AL355737 


Homo sapiens 


HMGiOA 


1782 


100 


1719 
1720 


AB029333 
AF071317 


Halocynthia 

roretzi 

Mus musculus 


HrPET-1 

COP9 complex subunit 7b


1069 
1297


46 
$7 


1721 
1722 


AJ272215 
G01982 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6063.


1681 
718


99 
100 


1723 


AL032643 


Caenorhabdit 
is elegans 


similar to Uncharacterized
protein family UPF0034, 


825 


41


1724 


G01972 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6053. 


SBfJ 


92 


1725 

1726 
1727 


Y94441 

AF255443
AF183426 


Homo 
sapiens 
Homo sapiens 
Homo sapiens 


Human Adipose Specific 
Protein 1. 
CGI-201 protein
HT004 protein 


1231 
4397 


100 
99 


1728 
1729 


D10884 
Z18529 


Bos taurus

Gallus 

gallus 


neurocalcin
tensin 


1810 
1002 
1411 


99 
99 
84 


1730 
1732 


Z73423 
AF090891


is elegans 
Homo sapiens 


\*LfEu\ Etoi EdFiDu i /ii4 sob comes 
from this gene-cDNA EST this 
gene 
PR6616S 


233 
470 


41 
30 


1733 
1734 

1735 


A^77724 
G04050 

D45913 


Homo sapiens 
Homo sapiens 

Mus musculus 


histone deacetylase 8 
Human secreted protein, SEQ 
ID NO: 8131. 


2015 
503 


100 
95 


1736 
1737 


AF096709
AF195120 


Drosophila 
virilis 
Homo sapiens 


leucine-rich-repeat protein
failed axon connections 
protein 

dynactin p62 subunit 


3531 
276


94 
32 


1738 


M.5314 


Caenorhabdit 
is elegans 


contains similarity to Pfam 
family PF01772 N=l 


2417 
206 


99 
37 


1739 


X54618 


Listeria

monocytogene 

s 


phosphatidylinositol specific
phospholipase C


134 


27 


1740 


AL031658 


Homo sapiens 


dJ310O13.4 (novel protein
similar to predicted C.
elegans and C. intestinalis
proteins) 


123 


ll 


1741 


Y35924


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
173. 


1013 


99 


1742 


AC013354 


Arabidopsis 

thaliana


F15H18.15 


202 


*ii 


1743 


W75771


Homo 
sapiens 


Human GTP binding protein 
APD08.


1932 


59 


1744 


W75771 


Homo
sapiens 


Human GTP binding protein 
APD08. 


1854 


61 


1745 


AF221098 


Homo 
sapiens 


Ral guanine nucleotide 
exchange factor RalGPSIA 


1224


70 


1746 
1747 


Y99372 
Y94294 


Homo sapiens 
Homo sapiens 


Human PRO1430 (UNQ736) amino
acid sequence SEQ ID NO: 116.
Human coenzyme A-utilising


1332 
842 


§9 
100
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1 secT 

ID 

NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION
enzyme coAEN-2.


SMITH- 
WATERMAN 
SCORE 


%

IDENTITY 


1748 
1749 


AK024436
AE000877


Homo sapiens 
Methanobacte
rium
thermoautotr
ophicum


FLJ00026 protein
conserved protein


1619 
231 


100 
"36 


1750 
1751 


AF101361
Y15367 


Drosophila 
melanogaster

Homo sapiens 


Abnormal X segregation 

ZNF232


"193 
889 


33 
100 


1752
1753

1754


AF251038 
AC003093 

X69089 


Homo sapiens 
Homo sapiens 

Homo sapiens 


GAP-like protein
OXYSTEROL-BINDING PROTEIN;
45% similarity to P22059
(PID:g129308)
165kD protein


822 
352 

5703 


100 
57 

99


1755 
1756 

1757


AL049795 
AL031393 


Homo sapiens 
Homo sapiens


dJ622L5.3 (novel protein)
dJ7JJljAb.i (zinc-finger
protein) 


"1039 
2765 


100
100


1758
1759


AB046^7i> 

AL022238
AF117653 


Homo sapiens 

Homo sapiens 
Homo sapiens 


UDP-GalNAc: polypeptide N- 

acetylgalactosaminyltransfera 
se 

dJ1042K10.4 (novel protein)
double homeobox protein


2020 
776 


99
43


1760
1761

1762


■yrrs^i 

AL049712 


Homo sapiens 


hNop56 

dJ686C3.2 (nucleolar protein
hNop56) 


375 

2959 

2595 


54
99
99




AC002394 


Homo
sapiens 


Gene product with similarity 
to dynein beta subunit


1542 


51 


1763 


"AF169017 


Homo sapiens 


formiminotransferase
cyclodeaminase


877 


100 


1764 


U91541 


Homo sapiens 


human formiminotransferase
cyclodeaminase (FTCD) protein,
carboxy-terminal end


596* 


100 


1765 


AB013365 


Bacillus 
halodurans 


" 7 YTqF * — 


350 


34 


1766


Vfi8421 


Homo sapiens 


Human secreted protein 
encoded by gene No. 36. 


145 


71


1767 


AC009176 


Arabidopsis
thaliana 


putative ribulose-1,5- 
bisphosphate 

carboxylase/oxygenase small 
subunit N-methyltransferase I


21* 


27 


1768
1769
1770

1771


AK000647 
AJ238982 
U73522 
U89435 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Mus musculus


unnamed protein product 

VNN3 protein 

AMSH 

unknown


737 
2665 

iii4 


99
99 

5* 1 


1772
1773
1774

1775


S70011 

AL035086 

Y99426 

AF110330 


Rattus sp.
Homo sapiens 
Homo sapiens 

Homo sapiens 


tricarboxylate carrier 
dJ44A20.2 (novel protein) 
Human PRO1604 (UNQ785) amino
acid sequence SEQ ID NO: 308.
glutaminase 


8*0 
1604 
2036 
1057 

3146 


86
35
100


100


1776 
1777 

1778 


AJ269529 
Z81579 

AY007239 


Caenorhabdit 
is elegans 
Homo sapiens


glycerol 3-phosphate permease

cDNA EST yk7tf£i.5 comes from

this gene 

monooxygenase X


2787 
232 


100

31 


1779 
1780


AL109608 
AF254260 


Schizosaccha 

romyces 

pombe 

Homo sapiens


oxysterol-binding protein
family 

tuftelin 1


1875 
644 


99
38


1781 


107924 


Mus musculus 


guanine nucleotide 
dissociation stimulator 


1729 
247 


10(5 

SO 


1782 
1783 


AF295773 
AK024475 


Homo 
sapiens 

Homo sapiens


ral guanine nucleotide 
dissociation stimulator 
FLJ00068 protein


142 
4333 


49

100


1784 
1785 

1786


AK024475 
G03933

S8i637 


Homo sapiens 
Homo sapiens 

Homo sapiens 


FLJ00068 protein

Human secreted protein, SEQ 

ID NO: 8014. 

Ig lambda-like gene/beta-


3996 
570 

247 


93
100

100
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TABLE 2 



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


%

IDENTITY 








glucuronidase exon 11 homolog 







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TABLE 3 



SEQ ID NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 




2 


BL00240 


Receptor tyrosine kinase 
class III proteins. 


BL00240B 24.70 8.250e- 
12 157-181 




3 


PR00109


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109D 17.04 8.085e- 
13 358-381 




4 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 9.400e-
10 1129-1146 BL00028 
16.07 1.257e-09 820- 
837 




5 


BL00023 


Type II fibronectin 
collagen-binding domain
proteins. 


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353- 
390




6 


BL00023


Type II fibronectin 
collagen-binding domain 
proteins.


BL00023 24.31 8.920e-
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 




7 


BL00023 


Type II fibronectin 
collagen-binding domain
proteins. 


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 




8


BL00023 


Type II fibronectin 
collagen-binding domain 
proteins. 


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 




9 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19. £4 £.119e- 
09 863-917 




"10 


PR00464 


E-CLASS P450 GROUP II 

SIGNATURE 


PR00464D 17.40 4.182e- 
12 294-312 PR00464G 
12.41 4.231e-11 177-
393 


11 


PR00734


GLYCOSYL HYDROLASE 
FAMILY 7 SIGNATURE 


PR00734I 11.46 4.2$Se- 
09 502-520 


12 


PF00023


Ank repeat proteins. 


PF00023B 14.20 6.500e- 
10 89-99 PF00023B 
14.20 2.636e-09 56-66 


14 


DM00031


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 3.648e- 
09 79-113 


15 


PR00208 


GLIADIN AND LMW GLUTENIN
SUPERFAMILY SIGNATURE 


PR00208A 12.59 $.668e- 
10 Si?- 1 *'!** ppnnortoa 

12.59 2.233e-09 520- 
538 


17 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.200e- 
14 282-295 PD00066 
13.92 9.400e-14 477- 
490 PD00066 13.92 
6.500e-13 505-518 
PD00066 13.92 9.500e- 
13 254-267 PD00066 
13.92 1.429e-12 393- 
406 PD00066 13.92 
6.571e-12 421-434 


18 


BL00845


CAP-Gly domain proteins. 


BL00845 16.43 2.200e-
25 55-80 


20 


BL00487


IMP dehydrogenase / GMP 
reductase proteins. 


BL00487E 16.12 5.737e- 
26 154-199 BL00487F 
18.79 8.984e-22 235- 
276 BL00487G 26.82 
4.082e-12 287-329 


21 




IMP dehydrogenase / GMP 
reductase proteins. 


BL00487E 16.12 5.737e-
26 154-199 BL00487F 
18.79 8.984e-22 235- 
276 BL00487G 26.82 
4.082e-12 348-390 


22 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 3.250e-
26 302-333 
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SEQ ID NO: 


ACCESSION 
NO. 


DESCRIPTION


RESULTS* 


23 


BL00107


Protein kinases ATP-
binding region proteins. 


BL00107A 18.39 3.250e-
26 302-333 




BL00115 


Eukaryotic RNA
polymerase II 
heptapeptide repeat 
proteins. 


BL00115T 8.45 7.273e- 
29 1208-1242 BL00115Q 
18.08 2.776e-21 953- 
983 BL00115Y 11.86
8.000e-17 1604-1650
BL00115M 19.19 8.130e- 
16 731-774 BL00115H 
14.34 9.392e-16 463- 
496 BL00115A 15.44 
7.414e-15 43-82 
BL00115R 6.50 6.128e- 
14 983-1010 BL00115J 
16.71 9.289e-14 591- 
617 BL00115I 8.33 
4.336e-13 535-590
BL00115L 12.25 5.939e-
13 662-694 BL00115G 
11.65 6.011e-13 435-
463 BL00115K 15.03 
3.417e-10 617-659
BL00115O 16.76 5.805e- 
10 863-913 BL00115P 
11.54 7.538e-10 913- 
953 BL00115S 18.24
7.968e-10 1010-1052 
BL00115U 10.34 4.475e- 
09 1242-1265 


26 


BL00420 


Speract receptor repeat 
proteins domain 
proteins. 


BL00420A 26.42 4.109e- 
11 81-110 BL00420A 
20.42 8.820e-10 84-113 


27


BL00050 


Ribosomal protein L23 
proteins. 


BL00050A 23.71 9.250e- 
27 94-127 BL00050B 
14.81 8.125e-12 133- 
147 


28


PR00925


NONHISTONE CHROMOSOMAL 
PROTEIN HMG17 FAMILY 
SIGNATURE 


PR00925B 3.73 3.089e- 
10 41-54 


29


PF00756


Putative esterase. 


PF00756C 14.12 1.108e- 
09 486-516 


32 


BL00557 


FMN-dependent alpha-
hydroxy acid
dehydrogenases proteins.


BL00557D 17.7* 5.6£5e-
37 274-316 BL00557A 
35.08 8.909e-29 24-73 
BL00557C 15.59 1.000e-
28 227-257 BL00557B 
21.27 8.B98e-22 130- 
169 


34 


PR00629 


SHC PHOSPHOTYROSINE 
INTERACTION DOMAIN 
SIGNATURE 


PR00629B 9.90 5.886e- 
35 299-328 PR00629F
10.95 8.364e-32 334- 
361 PR00629B 13.66 
3.786e-27 224-247 
PR00629A 13.45 8.364e- 
21 206-222 PR00629C 
3.80 4.000e-12 249-261 
PR00629D 12.45 3.739e- 
11 276-286 


35 


PD01270 


RECEPTOR FC
IMMUNOGLOBULIN AFFIN. 


PD01270A 17.22 1.000e-
40 39-79 PD01270B 
22.18 2.875e-38 94-131 
PD01270D 24.66 3.700e- 
34 171-207 PD01270C 
19.54 3.455e-30 137-
166 


36 


PD01270


RECEPTOR FC 
IMMUNOGLOBULIN AFFIN. 


PD01270A 17.22 1.000e-
40 39-79 PD01270B 
22.18 2.875e-30 94-131 






WO 01/53312 PCT/US00/34263 



SEQ ID NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 








PD01270D 24.66 3.700e- 
34 171-207 PD01270C 
19.54 3.455e-30 137-
166 


37 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412C 10.28 9.241e-
10 264-298 


38 


BL00412 


Neuromodulin (GAP-43) 
proteins. 


BL00412C 10.28 9.241e- 
10 264-298 


39 


BL00412 


Neuromodulin (GAP-43)
proteins. 


BL00412C 10.28 9.241e-
10 264-298 


40 


PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380B 12.64 7.366e- 
14 342-360 PR00380C 
13.18 6.927e-13 375- 
394 PR00380D 9.93 
2.180e-12 429-451 
PR00380A 14.18 5.154e- 
12 143-165 


44 


BL00345 


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 239-290 BL00345A 
13.96 2.452e-14 204-
223 


45 


BL00345 


Ets-domain proteins. 


BL00345B 21.28 1.000e-
40 215-266 BL00345A 
13.96 2.452e-14 180- 
199 


46 


DM01551 


lew OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551A 15.63 3.538e- 
26 172-202 DM01551C 
14.62 3.571e-17 232-
252 DM01551B 8.84
4.750e-11 214-226


47 


PR00876


NEMATODE METALLOTHIONEIN
SIGNATURE 


PR00876B 7.66 9.328e-
11 246-260 


48 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 4.231e- 
33 6-45 


50 


BL00972 


Ubiquitin carboxyl-
terminal hydrolases
family 2 proteins. 


BL00972D 22.55 7.750e- 
19 994-1019 BL00972A 
11.93 7.120e-18 216- 
234 BL00972E 20.72 
9.471e-14 1020-1042 
BL00972C 16.48 7.000e-
13 360-375 BL00972B 
9.45 8.269e-10 302-312


51 


BL00972


Ubiquitin carboxyl-
terminal hydrolases
family 2 proteins. 


BL00972D 22.55 7.750e-
19 990-1015 BL00972A 
11.93 7.120e-18 216- 
234 BL00972E 20.72 
9.471e-14 1016-1038 
BL00972C 16.48 7.000e-
13 360-375 BL00972B 
9.45 8.269e-10 302-312 


52 


BL01115 


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 3.063e- 
14 10-54 


53 


PR00988


URIDINE KINASE SIGNATURE 


PR00988A 6.39 8.500e-
17 20-38 PR00988F 
12.23 7.828e-15 196-
210 PR00988C 13.64
6.108e-14 104-120 
PR00988E 8.27 3.872e- 
11 174-186 PR00988D 
5.95 6.878e-10 160-171 
PR00988B 11.60 2.915e-
09 57-69 


55 


PR00762


CHLORIDE CHANNEL 
SIGNATURE 


PR00762C 9.29 4.682e-
21 294-314 PR00762D 
11.29 4.103e-19 509- 
530 PR00762A 14.22 
9.333e-18 199-217 






SEQ ID NO:


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 








PR00762F 15.12 3.100e- 
16 563-583 PR00762B 
12.12 6.063e-16 230-
250 PR00762E 12.07 
2.286e-15 545-562 
PR00762G 14.13 6.276e- 
13 601-616 


56 


BL00216 


Sugar transport 
proteins.


BL00216B 27.64 8.800e- 
10 153-203 


58 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin
receptors.


PF00791B 28.49 2.049e- 
10 1080-1135 


59 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin
receptors. 


PF00791B 28.49 2.049e- 
10 1062-1117 


61 


PD01929 


KINASE TYPE RESISTANCE 
ANTIBIOTIC TRANSFERASE 
AM. 


PD01929E 10.76 9.018e- 
09 206-221 


68 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e- 
09 680-693 


69 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e-
09 670-683 


70 


PF00651


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 8.714e-
10 51-64 


72 


DM00179


w KINASE ALPHA ADHESION
T-CELL.


DM00179 13.97 5.304e-
09 108-118 


73 


BL00239 


Receptor tyrosine kinase 
class II proteins. 


BL00239B 25.15 7.075e- 
12 118-166 


"74 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790N 13.25 *.116e- 
10 93-120 


76 


DM00471 


0 PROKARYOTIC DNA 
TOPOISOMERASE I. 


DM00471A 11.73 9.357e-
13 53-66 DM00471B 
8.45 4.857e-12 70-81 


80 


PD02876 


DECARBOXYLASE
PHOSPHATIDYLSERINE.


PD02876C 8.80 2.723e- 
13 223-236 PD02876D 
12.13 2.588e-12 334-
351 


81 


PD02876 


DECARBOXYLASE 
PHOSPHATIDYLSERINE.


PD02876C 8.80 2.723e- 
13 282-295 PD02876D 
12.13 2.588e-12 393-
410 


83 


BL00708 


Prolyl endopeptidase 
family serine proteins. 


BL00708B 24.91 7.197e- 
12 570-601 


84 


PR00014


FIBRONECTIN TYPE III 
REPEAT SIGNATURE 


PR00014C 15.44 8.043e-
09 985-1004 


86 


PR00678 


PI3 KINASE P85 
REGULATORY SUBUNIT 
SIGNATURE 


PR00678H 9.13 1.379e- 
09 246-269 


89 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 8.200e-
09 264-279 PR00320B 
12.19 8.650e-09 264- 
279 


93 


BL00455


Putative AMP-binding
domain proteins. 


BL00455 13.31 2.588e- 
14 316-332 


95 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e- 
10 123-154 


96 


BL00107 


Protein kinases ATP- 
binding region proteins.


BL00107A 18.39 4.000e- 
10 212-243 


97 


PR00081 


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081B 10.38 6.318e- 
13 134-146 PR00081A 
10.53 2.500e-12 54-72 


98 


PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 5.500e- 
24 401-423 PR00380D 
9.93 7.188e-20 613-635 
PR00380B 12.64 7.517e-
16 529-547 PR00380C
13.18 2.756e-13 560-
579 
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SEQ ID NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 


102 


PR00300 


ATP-DEPENDENT CLP
PROTEASE ATP-BINDING
SUBUNIT SIGNATURE


PR00300A 9.54 7.545e- 
14 289-308 


104 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479B 12.57 6.786e- 
18 298-314 BL00479A 
19.86 4.913e-16 155- 
178 BL00479A 19.86 
4.300e-13 272-295 
BL00479B 12.57 6.294e- 
12 181-197 


106 


"BL01019 


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 8.013e- 
12 43-83 


107 


"DM01970 


0 Jew ZK632.12 YDR313C 
ENDOSOMAL III.


DM01970B 8.£0 5.000e- 
16 403-416 


108 


BL00191


Cytochrome b5 family,
heme-binding domain
proteins.


BL00191K 17.38 4.951e-
27 238-282 BL00191J 
11.37 6.447e-17 182- 
204 


"109 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 4.938e-
37 8-47 


110 


BL01138 


Scorpion short toxins 
proteins.


BL01138A 10.96 8.297e-
10 38-50 


113 


"BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 5.800e- 
23 156-187 BL00107B 
13.31 9.100e-14 225-
241 


117 


BL00214 


Cytosolic fatty-acid 
binding proteins. 


BL00214B 26.51 1.000e-
17 46-91 BL00214A 
21.17 7.052e-U 5-31 


118 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BLO0107A 18.39 B.S^Oe- 
13 36-67 


119 


PR00529


GONADOTROPHIN RELEASING
HORMONE RECEPTOR 
SIGNATURE 


PR00529C 11.03 7.506e- 
10 158-177 


120 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 9.400e- 
09 80-95 


121 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 9.400e-
09 80-95 


127 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 7.158e-
13 216-241 


128 


BL01032 


Protein phosphatase 2C 
proteins. 


BL01032C 6.14 3.195e- 
12 147-157 BL01032H 
11.25 5.680e-11 318-
331 BL01032G 8.33
8.932e-11 282-296
BL01032I 10.42 8.902e-
09 379-389 


129 


BL01310 


ATP1G1 / PLM / MAT8
family proteins. 


BL01310 14.74 6.694e- 
26 28-64 


130 


PR00990 


RIBOKINASE SIGNATURE 


PR00990B 12.32 9.534e- 
15 47-67 PR00990A 
16.23 5.500e-14 20-42 
PR00990C 12.62 2.412e- 
09 119-133 


133 


BL00880 


Acyl-CoA -binding 
protein. 


BL00880 17.52 5.57^e- 
26 72-122 


134


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 9.308e- 
14 18-37 


135 


PR00215 


NEUROMODULIN SIGNATURE 


PR00215C 13.98 6.779e- 
10 475-496 


136 


BL01310 


ATP1G1 / PLM / MAT8
family proteins.


BL01310 14.74 2.432e-
29 71-107 


140 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.882e-
14 214-231 BL00028
16.07 9.471e-14 102-
119 BL00028 16.07 
2.800e-13 18-35
BL00028 16.07 5.500e- 
13 74-91 BL00028 
16.07 9.100e-13 186- 
203 BL00028 16.07 
8.043e-12 46-63 
BL00028 16.07 8.435e-
12 130-147 BL00028 
16.07 9.217e-12 270- 
287 BL00028 16.07 
6.192e-11 242-259
BL00028 16.07 4.000e- 
10 158-175 


141 


BL00501


Signal peptidases I
serine proteins.


BL00501D 16.69 9.538e- 
14 113-133 BL00501C 
9.61 8.688e-10 89-101 


143 


BL01020 


SAR1 family proteins.


BL01020C 15.35 7.722e- 
20 79-130 


146 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 6.400e- 
25 335-374 


149 


BL00126


3'5'-cyclic nucleotide
phosphodiesterases
proteins.


BL00126C 22.07 1.450e- 
25 509-550 BL00126E 
35.22 3.951e-16 654- 
709 BL00126D 25.50 
1.360e-15 565-604 
BL00126B 15.20 8.200e-
11 483-495 BL00126A
27.56 8.269e-11 442-
479 


151 


BL00632


Ribosomal protein S4 
proteins. 


BL00632 23.79 5.271e-
20 106-149 


154


BL00559


Eukaryotic molybdopterin
oxidoreductases
proteins.


BL00559I 13.63 5.304e- 
19 29-58 BL00559K 
13.17 2.957e-18 172- 
199 BL00559J 19.63 
8.385e-13 99-151 
BL00559L 13.60 5.814e- 
12 241-259 


155 


PR00449


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.692e- 
13 13-35 


157 


BL00406 


Actins proteins.


BL00406D 12.58 2 . *47e- " 
18 275-330 BL00406A 
9.95 5.776e-16 15-50 
BL00406B 5.47 7.429e- 
12 69-124 BL00406C 
6.75 9.682e-12 128-183 


160 


BL00132 


Zinc carboxypeptidases, 
zinc-binding region 1
proteins. 


BL00132A 26.07 7.000e- 
14 22-63 BL00132C 
21.35 3.466e-12 104- 
145 


165 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 9.043e- 
13 139-158 


168 


BL00362 


Ribosomal protein S15
proteins. 


BL00362 24.67 9.700e- 
15 129-172 


169 


BL00039


DEAD-box subfamily ATP-
dependent helicases 
proteins. 


BL00039D 21.67 1.000e-
35 640-686 BL00039A
18.44 1.964e-13 212-
251 BL00039B 19.19 
4.553e-13 378-404 
BL00039C 15.63 8.773e- 
12 465-489 


175 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 3.721e- 
12 14-36 


178 


BL01310 


ATP1G1 / PLM / MAT8
family proteins. 


BL01310 14.74 2.432e- 
29 133-169 


179 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 8.455e-
36 6-45


180 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE


PR00007B 14.16 7.429e- 
20 160-180 PR00007A 
19.33 4.938e-19 133-
160 PR00007C 15.60 
1.225e-15 206-228 
PR00007D 9.64 6.885e- 
11 238-249 


181 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 9.526e- 
24 280-323 


182 


BL00027


'Homeobox' domain
proteins. 


BL00027 26.43 9.526e- 
24 263-306 


183


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 9.526e- 
24 280-323 


184 


BL00027 


'Homeobox' domain 
proteins. 


BL00027 26.43 9.526e- 
24 263-306 


188 


PR00929 


AT-HOOK-LIKE DOMAIN
SIGNATURE 


PR00929C 5.26 3.328e- 
09 460-471 


189 


PR00929 


AT-HOOK-LIKE DOMAIN
SIGNATURE 


PR00929C 5.26 3.328e- 
09 440-451 


190 


BL00383 


Tyrosine specific 
protein phosphatases 
proteins.


BL00383F 15.51 7.188e-
17 666-682 BL00383A
13.34 8.714e-17 162-
177 BL00383E 10.35
1.000e-14 333-344
BL00383E 10.35 7.300e-
14 628-639 BL00383F
15.51 1.720e-13 371-
387 BL00383C 10.10
3.000e-13 217-228
BL00383D 11.92 7.000e-
13 295-308 BL00383B
7.61 1.692e-11 187-196
BL00383C 10.10 1.750e-
09 509-520 BL00383D
11.92 4.000e-09 589-
602 BL00383B 7.61
8.000e-09 479-488


191 


PR00450 


RECOVERIN FAMILY
SIGNATURE 


PR00450C 12.22 7.911e- 
15 83-105 PR00450C 
12.22 6.286e-13 47-69 


193 


PF00564


Octicosapeptide repeat
proteins. 


PF00564B 24.74 6.164e- 
16 227-278 




PR00503


BROMODOMAIN SIGNATURE


PR00503D 20.81 9.156e- 
15 204-224 PR00503B 
9.96 9.571e-13 170-187 


195 


BL00901 


Cysteine
synthase/cystathionine
beta-synthase P-
phosphate att.


BL00901C 20.63 3.429e- 
18 67-117 


197 


BL00636 


Nt-dnaJ domain proteins.


BL00636A 8.07 6.211e- 
17 40-57 BL00636B 
15.11 2.000e-13 67-88 


198 


PR00690 


ADHESIN FAMILY SIGNATURE 


PR00690A 10.86 9.866e-
09 463-482 


199 


BL01131 


Ribosomal RNA adenine
dimethylases proteins. 


BL01131A 26.62 2.343e- 
12 84-130 


201 


PR00910 


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE 


PR00910A 2.51 8.352e- 

1 O C AO CiO 


203 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.286e- 
10 39-72 


206


PR00261 


LOW DENSITY LIPOPROTEIN
(LDL) RECEPTOR SIGNATURE


PR00261A 11.02 4.462e-
19 65-87 PR00261C 
11.37 9.308e-19 65-87 
PR00261D 12.47 2.667e- 
18 65-87 PR00261B 
14.12 4.000e-18 143- 
165 PR00261A 11.02
4.833e-18 143-165 
PR00261D 12.47 7.500e- 
18 143-165 PR00261B 
14.12 5.065e-16 65-87 
PR00261C 11.37 8.967e- 
16 143-165 PR00261F 
11.57 4.938e-13 143- 
165 PR00261E 11.08 
7.188e-13 65-87 
PR00261F 11.57 7.188e- 
13 65-87 PR00261E
11.08 1.643e-11 143-
165 


209 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors.


PF00791B 28.49 6.143e-
13 118-173 PF00791C 
20.98 7.680e-10 132- 
171 


211 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007A 19.33 5.731e-
19 131-158 PR00007B 
14.16 4.115e-18 158- 
178 PR00007C 15.60 
1.675e-15 201-223 
PR00007D 9.64 7.231e-
11 233-244 


212 


BL00183


Ubiquitin-conjugating
enzymes proteins. 


BL00183 28.97 1.545e- 
30 43-91 


213 


BL00183 


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 1.545e-
30 43-91 


215 


BL00039 


DEAD-box subfamily ATP-
dependent helicases 
proteins. 


BL00039D 21.67 1.900e- 
29 568-614 BL00039A 
18.44 1.871e-23 21-60 
BL00039C 15.63 1.720e- 
11 364-388 BL00039B 
19.19 4.064e-11 277-
303 


217 


BL00100 


Chloramphenicol
acetyltransferase
proteins.


BL00100D 17.22 8.484e-
09 68-106 


219 


PR00213 


MYELIN P0 PROTEIN 
SIGNATURE 


PR00213C 15.94 3.969e-
11 199-227 


222 


BL00678


Trp-Asp (WD) repeat
proteins proteins. 


BL00678 9.67 1.947e-09 
144-155 


224 


PR00875


MOLLUSC METALLOTHIONEIN
SIGNATURE


PR00875A 5.83 1.000e-
09 901-913 


225 


BL00636 


Nt-dnaJ domain proteins. 


BL00636B 15.11 8.200e-
19 18-39 


226 


BL00636


Nt-dnaJ domain proteins.


BL00636A 8.07 1.000e-
21 21-38 BL00636B 
15.11 8.200e-19 45-66 


229


PR00301


70 KD HEAT SHOCK PROTEIN
SIGNATURE


PR00301F 13.98 7.563e-
13 329-346 PR00301G
13.78 4.300e-12 361-
382


230


BL00460


Glutathione peroxidases 
selenocysteine proteins. 


BL00460A 28.67 8.773e- 
20 35-70 BL00460B 
9.73 7.429e-16 78-96 
BL00460C 14.35 2.831e- 
12 111-134 BL00460D 
16.89 8.773e-11 140-
160 


231 


PR00647 


SENR ORPHAN RECEPTOR 
SIGNATURE 


PR00647B 10.19 8.522e- 
09 273-287 


233 


BL00292 


Cyclins proteins.


BL00292B 20.31 7.429e- 
27 244-275 BL00292A 
22.87 7.750e-27 201- 
235 


234 


PR00449


TRANSFORMING PROTEIN P21
RAS SIGNATURE 


PR00449A 13.20 6.308e- 
13 7-29 PR00449C
17.27 4.462e-11 47-70
PR00449D 10.79 7.120e- 
11 109-123 


235


PR00019


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 7.300e- 
10 251-265 PR00019B 
11.36 5.320e-09 119- 
133 PR00019B 11.36 
1.000e-08 229-243 


236


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 7.300e-
10 245-259 PR00019B
11.36 5.320e-09 113-
127 PR00019B 11.36
1.000e-08 223-237


237 


PD00289 


PROTEIN SH3 DOMAIN 
REPEAT PRESYNA.


PD00289 9.97 8.448e-09 
67-81 


240 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011D 14.03 3.492e- 
10 616-635 


241 


PR00011


TYPE III EGF-LIKE 
SIGNATURE 


PR00011D 14.03 3.492e- 
10 616-635 


244 


BL00903 


Cytidine and
deoxycytidylate
deaminases zinc-binding
region s. 


BL00903 12.93 8.941e- 
12 54-64 


245 


DM00179 


w KINASE ALPHA ADHESION 
T-CELL. 


DM00179 13.97 8.043e- 
09 124-134 


248 


BL00246 


Wnt-1 family proteins. 


BL00246D 23.97 1.000e-
40 186-239 BL00246E 
20.32 1.000e-40 305- 
351 BL00246B 13.69 
4.176e-36 105-140 
BL00246A 15.75 2.286e- 
24 70-90 BL00246C 
15.56 4.857e-22 150- 
175 


256 


PR00927 


ADENINE NUCLEOTIDE 
TRANSLOCATOR 1 SIGNATURE


PR00927E 14.93 5.114e- 
10 253-275 


254


BL00674


AAA-protein family
proteins.


BL00674B 4.46 1.000e-
09 223-245 


255 


PD01796 


PROTEIN TRANSMEMBRANE 
COBALT ZINC CADMIU. 


PD01796 15.01 6.045e-
09 61-88 


255 


BL50002 


Src homology 3 (SH3) 
domain proteins profile. 


BL50002B 15.18 2.800e- 
10 421-435


258


PR00094


ADENYLATE KINASE 
SIGNATURE 


PR00094C 12.94 2.200e- 
18 87-104 PR00094D 
12.52 2.731e-14 161- 
177 PR00094A 10.31 
5.500e-14 11-25 
PR00094B 11.01 4.115e- 
13 39-54 PR00094E 
11.25 7.333e-13 178- 
193 


259 


BL00892 


HIT family proteins. 


BL00892A 18.17 5.500e-
13 60-91 


262


BL00388 


Proteasome A-type
subunits proteins.


BL00388A 23.14 1.000e-
40 8-54 BL00388B
31.38 3.864e-33 66-108
BL00388D 20.71 1.000e-
21 153-184 BL00388C 
18.79 8.147e-16 126- 
148 


264


BL00903 


Cytidine and 
deoxycytidylate
deaminases zinc-binding 
region s. 


BL00903 12.93 5.821e- 
09 91-101 


267 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.529e-
09 241-257


270 


BL00226


Intermediate filaments 
proteins. 


BL00226D 19.10 1.000e-
37 362-409 BL00226B
23.86 8.043e-35 196-
244 BL00226C 13.23 
7.000e-20 261-292 
BL00226A 12.77 6.143e- 
15 96-111 


271 


PD02952


KINASE TRANSFERASE
CHOLINE PROTEIN
MULTIGENE FAMI.


PD02952C 15.76 9.731e-
16 235-265 PD02952B
15.57 5.625e-09 215-
229


272 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I. 


PD02929A 28.27 1.000e-
40 106-160 PD02929B 
18.36 8.800e-17 179- 
199 


274 


BL01027 


Glycosyl hydrolases 
family 39 proteins. 


BL01027B 15.34 3.486e-
09 213-250 


275 


PR00424 


ADENOSINE RECEPTOR 
SIGNATURE 


PR00424D 14.32 6.451e- 
11 39-59 


111 


BL00052 


Ribosomal protein S7 
proteins.


BL00052A 27. 8$ S.OOOe- 
13 137-184 BL00052B 
15.17 5.143e-12 208- 
235 


273 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790N 13.25 5.659e- 
13 267-294 


280 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e- 

13.41 1.000e-21 89-105 
PR00319A 15.27 8.364e-

21 51-68 PR00319B 
11.47 8.200e-19 70-85 


281 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE 


PR00319D 11.64 6.625e- 
23 94-112 PR00319C
13.41 1.000e-21 76-92
PR00319A 15.27 8.364e- 
21 38-55 PR00319B 
11.47 8.200e-19 57-72 


287 


PF00929 


Exonuclease. 


PF00929D 16.17 7.366e- 
09 149-163 


291 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 2.360e- 
09 93-127 


292 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 2.360e- 
09 93-127 


294 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.714e- 
12 203-216 


295 


BL00028


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 £.506e-
15 322-339 BL00028 
16.07 9.471e-14 433- 
450 BL00028 16.07 
4.600e-13 648-665 
BL00028 16.07 5.500e- 
13 760-777 BL00028 
16.07 9.550e-13 788- 
805 BL00028 16.07 
3.348e-12 704-721 
BL00028 16.07 6.478e- 
12 461-478 BL00028 
16.07 8.435e-12 844- 
861 BL00028 16.07 
1.692e-11 593-610
BL00028 16.07 2.038e-
11 211-228 BL00028
16.07 5.154e-11 732-
749 BL00028 16.07
5.846e-11 377-394
BL00028 16.07 6.885e-
11 816-833 BL00028
16.07 7.231e-11 676-
693 BL00028 16.07
9.654e-11 564-581
BL00028 16.07 4.086e- 
09 517-534 BL00028 
16.07 7.429e-09 489- 
506 


296 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 8.333e-
16 111-136 BL00215A
15.82 2.723e-11 10-35
BL00215B 10.44 9.526e- 
11 152-165 BL00215B 
10.44 7.375e-10 59-72 
BL00215A 15.82 9.824e- 
10 205-230 


302 


PF00953 


Glycosyl transferase.


PF00953C 19.70 8.773e-
34 236-269 PF00953A
19.68 5.000e-25 102-
129 PF00953B 6.17 
1.000e-13 182-194 


304 


PF00152 


tRNA synthetases class 
II. 


PF00152D 21.30 8.364e-
28 422-461 PF00152C
28.03 9.250e-21 220-
257 PF00152B 15.67
2.658e-13 159-184
PF00152A 19.68 5.714e- 
11 44-67 


305 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 8.250e- 
35 37-76 


306


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN.


PD02784B 26.46 5.840e- 
09 92-135 


307 


PR00454 


ETS DOMAIN SIGNATURE


PR00454C 11.24 7.866e-
09 1167-1186 


308 


PR00237


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237E 13.03 5.051e-
13 188-212 PR00237G 
19.63 7.207e-13 268- 
295 PR00237A 11.48 
4.375e-11 24-49
PR00237C 15.69 3.057e- 
10 101-124 PR00237D 
8.94 4.750e-10 137-159 
PR00237F 13.57 5.364e- 
10 230-255 PR00237B 
13.50 9.438e-10 57-79 


309 


BL00522 


DNA polymerase family X 
proteins.


BL00522C 11.90 7.577e-
24 315-339 BL00522F
14.90 1.310e-15 470-
494 BL00522A 25.52
1.265e-14 179-226
BL00522E 19.63 8.615e-
14 430-460 BL00522B
27.30 9.625e-12 267-
313


310 


BL00326


Tropomyosins proteins. 


BL00326D 8.76 5.235e- 
10 856-897 


312 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290A 20.89 4.706e- 
14 151-174 BL00290B 
13.17 5.000e-12 211-
229 


313 


BL00345 


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 34-85 BL00345A
13.95 9.217e-16 1-20 


31* 


PF00651


BTB (also known as BR-
C/Ttk) domain proteins. 


PF00651 15.00 5.091e- 
15 63-76 


317 


BL01020 


SAR1 family proteins.


BL01020C 15.35 3.198e-
17 79-130 


318 


BL00216 


Sugar transport 
proteins. 


BL00216B 27.64 4.696e- 
11 164-214 


320 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.814e-
10 216-235


321 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 5.688e- 
10 329-372 


322 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 8.765e- 
12 558-577 


324 


BL01241 


Link domain proteins.


BL01241 35.81 8.313e-
30 183-236 BL01241
35.81 3.222e-13 282-
335 


326 


BL00412 


Neuromodulin (GAP-43) 
proteins. 


BL00412D 16.54 4.000e-
12 515-566 BL00412D
16.54 5.705e-11 516-
567 BL00412D 16.54
7.848e-10 518-569
BL00412D 16.54 1.827e-
09 514-565 BL00412D
16.54 1.918e-09 513-
564 BL00412D 16.54
2.102e-09 520-571


328 


BL00232


Cadherins extracellular
repeat proteins domain
proteins.


BL00232B 32.79 9.557e-
20 151-199 BL00232B
32.79 2.246e-18 41-89
BL00232B 32.79 5.985e-
18 370-418 BL00232B
32.79 5.500e-16 258-
306 BL00232B 32.79
9.384e-15 475-523
BL00232C 10.65 2.537e-
12 256-274 BL00232C
10.65 4.326e-11 368-
386 BL00232C 10.65
7.261e-11 473-491
BL00232C 10.65 7.457e-
11 39-57


330 


PR00454 


ETS DOMAIN SIGNATURE 


PR00454C 11.24 7.808e-
09 1167-1186 


331 


BL00598 


Chromo domain proteins. 


BL00598 14.45 8.393e- 
18 27-49 


333 


BL01016 


Glycoprotease family 
proteins. 


BL01016C 22.84 3.925e-
32 70-115 BL01016E
14.88 5.286e-19 149-
177 BL01016H 13.71
7.577e-13 291-301
BL01016D 8.86 3.295e-
11 127-140 BL01016G
7.14 5.622e-10 261-271
BL01016A 5.65 7.167e-
10 4-19 BL01016F
13.34 1.563e-09 200-
212 BL01016B 8.93
8.855e-09 38-50


339 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 5.500e-
11 17-61 


340 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 1.231e- 
33 10-49 


341


BL01160


Kinesin light chain
repeat proteins.


BL01160B 19.54 5.042e-
09 55-109 


342 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.400e- 
30 16-55 


343 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 1.000e-
40 20-66 


Ut 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.764e-
11 135-154 


347 


PR00109


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.764e-
11 135-154


351 


BL01187 


Calcium-binding EGF-like
domain proteins pattern
proteins.


BL01187B 12.04 1.783e-
13 100-116 BL01187B
12.04 8.435e-13 276-
292 BL01187B 12.04
8.800e-ll 13-29 
BL01187B 12.04 7.429e- 
10 54-70 BL01187B 
12.04 5.725e-09 231- 
247 BL01187A 9.98 
7.000e-09 255-267 


352 


PD00078 


REPEAT PROTEIN ANK
NUCLEAR ANKYR. 


PD00078B 13.14 5.950e- 
10 366-379 PD00078B
13.14 4.522e-09 168- 
181 


354 


BL00380 


Rhodanese proteins. 


BL00380F £.7t> 6.694e- 
11 542-553 


355 


PF00628 


PHD-finger.


PF00628 15.84 1.000e-
11 116-131 


356 


PR00587 


SOMATOSTATIN RECEPTOR 
TYPE 1 SIGNATURE 


PR00587A 8.06 9.700e-
09 17-37 


359 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 4.462e-
[illegible] PD00066
13.92 6.500e-13 233-
246 PD00066 13.92
[illegible]


361 


PF00791 


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 9.604e-
13 54-109 PF00791B
28.49 1.095e-12 21-76
PF00791B 28.49 1.432e-
09 71-126 PF00791B
28.49 7.440e-09 184-
239


362 


PF00791 


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 2.273e-
11 279-334 


363 


PR00450


RECOVERIN FAMILY
SIGNATURE


PR00450C 12.22 5.080e-
10 73-95 PR00450C
12.22 3.278e-09 109-
131 


364 


PF00242 


DNA polymerase (viral) 
N-terminal domain
proteins.


PF00242Q 13.51 2.328e-
09 22-68 


365 


PF00242


DNA polymerase (viral)
N-terminal domain 
proteins. 


PF00242Q 13.51 2.328e- 
09 22-68 


366 


BL01160


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 6.644e- 
09 1038-1092


367


PR00019


LEUCINE-RICH REPEAT
SIGNATURE 


09 229-243 PR00019B 
11.36 6.040e-09 91-105
PR00019A 11.19 8.667e- 
09 370-384 


368 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011D 14.03 9.000e-
15 30-49 PR00011A
14.06 9.830e-15 30-49
PR00011B 13.08 4.500e-
14 30-49 PR00011C
24.25 5.143e-09 6-35


369 


BL01032 


Protein phosphatase 2C 
proteins.


BL01032H 11.25 4.150e-
09 417-430 


372 


BL00478 


LIM domain proteins. 


BL00478B 14.79 7.750e-
12 410-425 




PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL- 
BINDING NU. 


PD01066 19.43 9.757e- 
34 26-65 


376 


PR00170 


SODIUM CHANNEL SIGNATURE 


pr66i?6e ^.48 2TfI55= —
10 88-118 


380


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 1.000e-
23 276-307 BL00107B 
13.31 1.692e-12 342- 
358 


381


BL00455 


Putative AMP-binding 
domain proteins. 


BL00455 13.31 5.714e-
12 50-66 


382 


PR00624 


HISTONE H5 SIGNATURE


PR00624G 4.08 4.900e-
09 524-544 


384 


PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.950e-
10 366-379 PD00078B 
13.14 4.522e-09 168- 
181 


385 


PR00511 


TEKTIN SIGNATURE 


PR00511D 7.11 5.371e- 
09 67-80 


386 


PD02870


RECEPTOR INTERLEUKIN-1
PRECURSOR.


PD02870B 18.83 6.000e-
10 97-130 


388 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 5.000e-
13 316-329


389


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290A 20.89 7.657e- 
09 151-174 


390 


BL00215 


Mitochondrial energy- 
transfer proteins.


BL00215A 15.82 5.200e-
15 221-246 BL00215A
15.82 7.618e-14 20-45
BL00215A 15.82 8.851e-
11 123-148 BL00215B
10.44 9.526e-11 69-82
BL00215B 10.44 7.300e- 
09 272-285 BL00215B 
10.44 8.500e-09 165- 
178 


394 


BL00674 


AAA-protein family
proteins. 


BL00674B 4.46 2.723e- 
16 299-321 


397 


PR00048


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 8.579e- 
11 141-155 


398 


PR00761


BINDIN PRECURSOR 
SIGNATURE 


PR00761B 9.93 6.764e- 
09 55-74 


399 


BL00240 


Receptor tyrosine kinase 
class III proteins. 


BL00240B 24.70 7.907e- 
10 118-142 


401 


PF00676


Dehydrogenase E1
component.


PF00676H 24.71 8.071e- 
18 331-369 PF00676D 
14.40 3.854e-15 486- 
506 PF00676C 16.88 
9.182e-14 454-478 


402 


BL00514 


Fibrinogen beta and 
gamma chains C-terminal
domain proteins.


BL00514C 17.41 4.673e-
28 4432-4469 BL00514G 
15.98 6.092e-14 4555- 
4585 BL00514D 15.35 
2.532e-12 4473-4486 
BL00514F 11.65 4.288e- 
10 4519-4534 BL00514H 
14.95 4.955e-10 4584- 
4609 


403 


PF00992 


Troponin. 


PF00992A 16.67 5.474e-
09 105-140 


404 


PR00019


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 1.450e- 
10 73-87 PR00019A 
11.19 8.043e-10 76-90
PR00019B 11.36 1.000e-
09 50-64 PR00019B
11.36 1.000e-09 96-110 


405 


BL00232


Cadherins extracellular
repeat proteins domain
proteins.


BL00232B 32.79 9.557e-
20 139-187 BL00232B
32.79 2.246e-18 29-77
BL00232B 32.79 5.985e-
18 358-406 BL00232B
32.79 5.500e-16 246-
294 BL00232B 32.79
9.384e-15 463-511
BL00232C 10.65 2.537e-
12 244-262 BL00232C
10.65 4.326e-11 356-
374 BL00232C 10.65
7.261e-11 461-479
BL00232C 10.65 7.457e-
11 27-45


407 


PF00426


Outer Capsid protein VP4
(Hemagglutinin).


PF00426S 15.67 5.634e- 
09 902-940 


409 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 9.695e- 
09 126-180 


410 


BL00741 


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 2.731e- 
09 252-275 


411


PF00646 


F-box domain proteins. 


PF00646A 14.37 6.344e- 
09 86-100 


412 


BL00603 


Thymidine kinase 
cellular-type proteins.


BL00603B 11.39 8.500e- 
09 542-557 


415 


BL00866


Carbamoyl-phosphate
synthase subdomain 
proteins. 


BL00866B 36.29 3.571e- 
31 245-291 BL00866C 
23.26 9.000e-25 331- 
366 


418


PR00239


MOLLUSCAN RHODOPSIN C-
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 6.114e- 
09 590-602 


421 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin
receptors.


PF00791B 28.49 7.955e-
14 23-78 PF00791B
28.49 3.653e-12 273-
328 PF00791B 28.49
4.273e-11 156-211
PF00791B 28.49 7.818e-
11 89-144 PF00791B
28.49 1.524e-10 56-111
PF00791C 20.98 3.559e-

20.98 5.235e-09 170-
209 PF00791C 20.98

PF00791B 28.49 6.202e-
09 189-244 PF00791B
28.49 7.028e-09 435-
490 PF00791B 28.49
8.679e-09 367-422


424 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 7.207e- 
28 1645-1679


425 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109D 17.04 5.881e- 
10 228-251 


429 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 4.600e- 
11 31-40 


431 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 1.844e-
34 490-536 BL00039A
18.44 5.615e-19 205-
244 BL00039B 19.19
8.920e-16 251-277
BL00039C 15.63 5.781e-
15 333-357 


432 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 7.652e-
12 169-185


433 


PR00828 


FORMIN SIGNATURE 


PR00828B 5.23 8.218e- 
10 382-405 


436 


BL00415 


Synapsins proteins.


BL00415N 4.29 8.643e- 
11 195-239 BL00415N 
4.29 3.036e-09 809-853 


443 


PR00834 


HTRA/DEGQ PROTEASE 
FAMILY SIGNATURE 


PR00834F 10.91 6.040e-
11 221-234 


446 


PF01140 


Matrix protein (MA),
P15.


PF01140D 15.54 9.663e-
10 183-218 PF01140D
15.54 3.093e-09 246- 
281 


449 


PR00568


DOPAMINE D3 RECEPTOR
SIGNATURE


PR00568G 13.95 5.551e-
09 39-53 


451 


PF00084 


Sushi domain proteins 
(SCR repeat proteins. 


PF00084B 9.45 3.813e- 
10 47-59 


452 


BL00790 


Receptor tyrosine kinase 
class V proteins.


BL00790I 20.01 2.821e-
09 618-649 


456 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 1.000e-
25 77-99 PR00380D 
9.93 1.000e-21 281-303 
PR00380C 13.18 8.286e- 
17 230-249 PR00380B 
12.64 4.724e-16 194- 
212 


457 


PR00253 


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR
SIGNATURE


PR00253A 9.15 9.143e- 
24 246-267 PR00253B 
13.47 2.000e-23 272- 
294 PR00253C 13.85 
7.000e-23 306-328 
PR00253D 16.68 5.950e-
21 452-473 


*kx> / 


PR00849


GLYCOSYL HYDROLASE 
FAMILY 58 SIGNATURE 


PR00849D 9.77 9.236e- 
09 910-937 


H f A 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 8.200e-12
33-44 


472 


BL00226 


Intermediate filaments 
proteins.


BL00226B 23.86 3.721e-
09 282-330


473


BL00344 


GATA-type zinc finger 
domain proteins. 


BL00344 17.99 7.000e- 
12 814-852 


474 


BL00481 


Thiol-activated
cytolysins proteins. 


BL00481E 13.67 8.909e- 
09 173-199 


479 


PR00319


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE 


PR00319B 11.47 2.571e- 
09 393-408 


480 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 1.900e- 
38 8-47 


481 


PR00405 


HIV REV INTERACTING
PROTEIN SIGNATURE


PR00405C 19.41 1.000e-
19 451-473 PR00405B
11.83 4.333e-18 430-
448 PR00405A 17.71
4.971e-18 411-431 


482 


PR00049


WILM'S TUMOUR PROTEIN
SIGNATURE 


PR00049D 0.00 9.286e- 
10 959-974 PR00049D 
0.00 9.857e-10 958-973 
PR00049D 0.00 1.305e- 
09 937-952 PR00049D 
0.00 8.322e-09 939-954 


486 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 



PR00007B 14.16 S.fc'lSe- 
23 653-673 PR00007A 
19.33 6.192e-22 626- 
653 PR00007C 15.60 
5.846e-19 698-720 
PR00007D 9.64 3.647e- 
13 732-743 


487 


PD00567 


PROTEIN RNA-BINDING RNA
REPEAT HYD. 


PD00567B 18.23 2.8*3e- 
09 200-214 


488 


PR00988 


URIDINE KINASE SIGNATURE 


12 3-21 


489 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 4.882e- 
27 30-69 PD01066 
19.43 3.430e-10 71-110 


490 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


1*R00049D 0.00 7.8& i 4e- ~ 
09 663-678


492


BL01128 


Shikimate kinase 
proteins.


BL01128A 18.84 6.464e- 
17 58-92 


497 


PF00429 


ENV polyprotein (coat
polyprotein).


PF00429 31.08 7.171e-
15 21-71


498 


BL00120 


Lipases, serine 
proteins. 


BL00120B 11.11 7.923e-
09 185-200 


500


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 7.353e- 
11 299-318 


501 


BL01159 


WW/rsp5/WWP domain 
proteins.


BL01159 13.85 8.579e- 
12 131-146 


505 


BL00021 


Kringle domain proteins. 


BL00021B 13.33 3.739e-
17 492-510 


508 


PR00120 


H+-TRANSPORTING ATPASE
(PROTON PUMP) SIGNATURE


PR00120C 9.90 5.800e-
19 705-722 


509 


DM01417


6 kw INDUCING XPMC2
MUSHROOM SPAC22G7.04. 


DM01417E 20.62 2.938e- 
16 362-395 DM01417D 
11.08 3.800e-13 322- 
338 


510


PF00534


Glycosyl transferases
group 1.


PF00534B 14.47 6.625e- 
09 346-370 


511 


PF00534 


Glycosyl transferases 
group 1. 


PF00534B 14.47 6.625e-
09 293-317 


512 


PF00534 


Glycosyl transferases 
group 1. 


PF00534B 14.47 6.625e- 
09 366-390 


513 


PD01841 


PHOSPHORYLASE KINASE
ALPHA MUSCL. 


PD01841A 21.71 1.000e-
40 110-160 PD01841B
14.35 1.000e-40 181-
222 PD01841D 17.87
1.000e-40 243-295
PD01841F 13.36 1.000e-
40 333-382 PD01841G
24.26 1.000e-40 386-
440 PD01841L 18.42
1.000e-40 968-1010
PD01841I 23.00 4.545e-
37 762-804 PD01841E
18.60 3.750e-36 295-
333 PD01841J 14.94
6.023e-35 851-888
PD01841H 21.30 2.909e-
33 490-527 PD01841K
14.81 7.088e-33 924-
954 PD01841C 13.78
9.386e-23 222-243
PD01841M 10.82 8.594e- 
21 1054-1073 PD01841I 
23.00 2.667e-13 549- 
591 


514 


PR00153 


CYCLOPHILIN PEPTIDYL- 
PROLYL CIS-TRANS
ISOMERASE SIGNATURE 


PR00153C 11.01 7.188e- 
13 95-111 PR00153E 
9.10 4.150e-12 122-138 


515 


BL00740 


MAM domain proteins. 


BL00740A 13.87 7.188e-
12 410-423 


516 


DM00892


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 6.087e- 
12 1018-1052 


517 


BL00242


Integrins alpha chain
proteins.


BL00242C 16.36 8.320e- 
09 12-42 


523 


DM00031 


IMMUNOGLOBULIN V REGION.


DM00031A 16.80 3.750e- 
39 20-68 DM00031B 
15.41 1.000e-25 84-118 


524


BL00319 


Amyloidogenic 
glycoprotein 
extracellular domain 
proteins.


BL00319C 17.12 8.375e- 
10 61-95 


525 


PF00789


Domain present in
ubiquitin-regulatory
proteins. 


PF00789B 19.70 3.308e- 
12 322-343 PF00789C 
20.98 5.269e-09 367- 
392 


528 


BL01162 


Quinone oxidoreductase /
zeta-crystallin
proteins.


BL01162C 22.80 1.500e- 
16 120-164


529 


PR00910 


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE


PR00910A 2.51 3.893e-
09 60-73 


532 


BL00215


Mitochondrial energy- 
transfer proteins. 


BL00215A 15.82 4.000e-
17 11-36 BL00215A
15.82 8.660e-11 123-
148


533 


BL00215 


Mitochondrial energy- 
transfer proteins. 


BL00215A 15.82 4.000e- 
17 11-36 BL00215A 
15.82 8.660e-11 97-122


534 


BL00098 


Thiolases acyl-enzyme 
intermediate proteins. 


BL00098C 21.65 2.800e- 
38 181-227 BL00098B 
32.59 5.345e-38 86-141 
BL00098D 26.30 8.364e- 
35 245-288 BL00098E 
22.12 1.000e-34 314- 
352 BL00098F 10.18 
4.971e-22 365-386 
BL00098A 10.60 6.455e- 
11 38-50 


535


PR00370


FLAVIN-CONTAINING
MONOOXYGENASE (FMO) 
SIGNATURE 


PR00370E 11.96 7.429e-
22 321-340 PR00370D
16.33 6.143e-21 185-
204 PR00370F 17.75
6.559e-21 376-396
PR00370B 10.91 9.591e-
21 27-46 PR00370C
12.72 3.500e-20 140-
157 PR00370A 3.35
6.442e-17 4-20


536 


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.429e-
[illegible] BL00028
16.07 6.294e-14 341-
358 BL00028 16.07
1.346e-11 369-386
BL00028 16.07 1.000e-
11 397-414 BL00028
16.07 4.452e-11 453-
470 BL00028 16.07
7.231e-11 425-442
BL00028 16.07 4.300e-
10 313-330


537


BL00762 


WHEP-TRS domain 
proteins.


BL00762A 23.43 9.419e- 
15 844-881 


538 


BL00762 


WHEP-TRS domain 
proteins.


BL00762A 23.43 9.419e- 
15 819-856 


539 


BL00762


WHEP-TRS domain
proteins.


BL00762A 23.43 9.419e- 
15 822-859 


540 


PR00985 


LEUCYL-TRNA SYNTHETASE 
SIGNATURE 


PR00985A 12.10 9.000e-
10 357-375 


541 


PD02102


SUBUNIT E V-ATPASE
VACUOLAR ATP SYNTHASE
HYDROL.


PD02102A 16.74 1.000e-
40 3-47 PD02102B 
18.28 4.375e-34 57-100 
PD02102D 21.69 1.923e- 
30 179-218 PD02102C 
26.34 8.929e-26 100- 
146 


543 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 1.000e-
10 48-65 BL00028 
16.07 6.400e-10 193- 
210 BL00028 16.07 
1.000e-09 343-360 
BL00028 16.07 6.914e- 
09 78-95 


545 


BL00250 


TGF-beta family 
proteins. 


BL00250A 21.24 8.000e- 
31 293-325 BL00250B
27.37 5.286e-24 354-
390 


547 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 1.714e-
09 186-201 PR00319A
15.27 7.344e-09 210-
227




548 


BL01204 


NF-kappa-B/Rel/dorsal
domain proteins.


BL01204A 17.74 1.000e-
40 8-56 BL01204D 
16.42 1.000e-40 177- 
221 BL01204E 13.83 
7.652e-30 225-250 
BL01204C 13.93 8.714e- 
22 141-160 BL01204B 
15.41 4.333e-16 102- 
116 




549 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 8.364e- 
15 255-276 




551 


PF00632 


HECT-domain (ubiquitin-
transferase).


PF00632C 20.66 3.302e-
23 1569-1601 PF00632B
18.45 3.700e-21 1515-
1543 




554 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290B 13.17 1.600e- 
14 187-205 BL00290A
20.89 2.059e-14 130- 
153 




557 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 6.339e- 
09 846-879 




559 


DM01111


4 kw PHOSPHATASE
TRANSFORMING 61K PDF1.


DM01111L 11.93 3.762e- 
09 7-35 




562 


PF00658 


Poly-adenylate binding 
protein, unique domain 
proteins. 


PF00658C 16.33 9.455e- 
32 118-155 




"TO 


BL00141


Eukaryotic and viral
aspartyl proteases
proteins.


BL00141A 12.10 4.150e-
10 472-488


566


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.657e- 
15 272-289 


567 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 4.977e-
13 225-268


569


BL00107


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 7.000e-
19 118-149 BL00107B 
13.31 5.500e-15 183- 
199 


570 


BL00107 


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 7.000e-
19 118-149 BL00107B
13.31 5.500e-15 183-
199 


572 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 1.857e- 
34 454-483 PR00193C 
12.60 2.636e-31 223- 
251 PR00193B 11.69 
7.750e-29 171-197 
PR00193A 15.41 2.588e- 
22 115-135 PR00193E 
19.47 6.559e-19 508- 
537 


573 


PR00193 


MYOSIN HEAVY CHAIN
SIGNATURE


PR00193D 14.36 1.857e-
34 470-499 PR00193C
12.60 2.636e-31 239-
267 PR00193B 11.69
7.750e-29 171-197
PR00193A 15.41 2.588e-
22 115-135 PR00193E
19.47 6.559e-19 524- 
553 


575 


BL00752 


XPA protein. 


BL00752B 19.17 9.703e- 
10 885-929 


576 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 7.000e- 
09 276-295 


577 


BL00116


DNA polymerase family B
proteins.


BL00116A 12.81 5.737e-
13 864-877 BL00116B
11.82 1.529e-12 952-
965


578 


BL00195 


Glutaredoxin proteins.


BL00195B 15.31 7.158e-
09 121-141


579


PR00019


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 9.000e- 
11 217-231 PR00019B 
11.36 1.360e-09 386- 
400 PR00019A 11.19 
3.333e-09 389-403 
PR00019B 11.36 8.920e-
09 363-377 


580 


PR00253


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR
SIGNATURE


PR00253A 9.15 2.125e- 
25 275-296 PR00253B 
13.47 7.923e-24 301- 
323 PR00253D 16.68 
5.846e-23 444-465 
PR00253C 13.85 2.241e- 
20 335-357 


583 


PR00343 


SELECTIN SUPERFAMILY
COMPLEMENT-BINDING
REPEAT SIGNATURE


PR00343C 16.85 2.286e-
11 1233-1252 PR00343C
16.85 5.500e-11 333-
352 PR00343C 16.85
5.500e-11 783-802
PR00343C 16.85 4.246e- 
10 1491-1510 PR00343C 
16.85 8.230e-10 1686- 
1705 


584 


DM01537 


kw SKI2W SKI2 NUCLEOLAR 
HELICASE.


DM01537B 21.63 1.878e-
37 79-126 DM01537B
21.63 9.491e-30 916-
963 DM01537A 15.14
3.186e-11 784-804


586 


PF00013


KH domain proteins 
family of RNA binding 
proteins. 


PF00013 5.78 1.450e-09 
124-136 


587 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 4.409e- 
13 262-296 


589 


BL00478 


LIM domain proteins. 


BL00478B 14.79 1.643e- 
13 261-276 BL00478B 
14.79 7.709e-09 321- 
336 


590


PF00855 


PWWP domain proteins. 


PFooa^s 13.75 s.oooe- 

15 931-948 


591


PF00855


PWWP domain proteins.


PF00855 13.75 8.000e-
15 1062-1079 


593 


PF00628 


PHD-finger.


PF00628 15.84 3.455e-
12 424-439 


594 


PR00205 


CADHERIN SIGNATURE


PR00205B 11.39 2.241e-
16 558-576 PR00205A
14.73 9.308e-13 542- 
558 PR00205C 13.65 
5.304e-12 594-609 
PR00205B 11.39 4.273e- 
10 336-354 


596 


BL00107 


Protein kinases ATP- 
binding region proteins.


BL00107A 18.39 4.789e-
18 307-338 


598 


PD01675 


GLYCOPROTEIN MAJOR 
ENVELOPE PROBABLE U3 . 


PD01675C 19.89 2.330e- 
10 55-39 


600 


BL00242 


Integrins alpha chain 
proteins . 


BL00242B 9.03 9.591e- 
27 985-1014 BL00242C 
16.86 4.115e-26 286-
316 BL00242D 13.57
4.150e-25 357-382 
BL00242B 8.13 7.353e- 
12 189-199 BL00242D 
13.57 3.455e-11 421-
446 BL00242A 13.80
$.0006-11 *i-73 
BL00242D 13.57 4.986e- 
10 291-316 


601 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 5.610e- 
09 198-213 


602 


PR00278


PANCREATIC HORMONE 
SIGNATURE 


PR00278A 12.43 4.*6 l 9e- " 
10 331-348 


603 


BL00479 


Phorbol esters /
diacylglycerol binding
domain proteins.


BL00479C 12.01 3.250e-
12 170-183 


604 


BL00315 


Dehydrins proteins.


BL00315A 9.3* l.£72e- 
09 424-452 


605 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.794e- 
10 295-339 


606 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 1.000e-
13 335-358 


608 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.167e-
15 265-282


609


PF00855


PWWP domain proteins.


PF00855 13.75 5.167e-
15 211-228 


612 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN.


DM01206B 10.69 7.411e-
10 877-897 DM01206B
10.69 8.027e-10 861-
881 DM01206B 10.69
9.137e-10 873-893
DM01206B 10.69 1.456e-
09 859-879 DM01206B 
10.69 1.797e-09 879- 
899 DM01206B 10.69 
4.076e-09 865-885 
DM01206B 10.69 7.038e- 
09 898-918 DM01206B 
10.69 7.949e-09 871- 
891 DM01206B 10.69 
8.291e-09 767-787 


615 


PD02699 


PROTEIN DNA-BINDING
BINDING DNA.


PD02699A 8.91 2.023e-
28 129-158 PD02699C
24.84 1.000e-27 317-
364 PD02699B 18.28
1.000e-17 158-182


616 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 4.086e- 
22 288-310 PR00380D 
9.93 3.721e-17 486-508 
PR00380B 12.64 2.241e- 
16 410-428 PR00380C 
13.18 2.976e-13 436- 
455


617 


PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 4.086e- 
22 288-310 PR00380D 
9.93 3.721e-17 486-508 
PR00380B 12.64 2.241e- 
16 410-428 PR00380C 
13.18 2.976e-13 436-
455 


618 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 5.143e-
12 531-551 DM01206B 
10.69 2.603e-10 535- 
555 


cit ™ " 


PR00700


PROTEIN TYROSINE
PHOSPHATASE SIGNATURE


PR00700B 16.80 3.160e-
21 561-582 


622 


BL00239 


Receptor tyrosine kinase 
class II proteins. 


BL00239F 28.1$ 3.222e- 
10 647-692 BL00239C 
18.75 8.304e-10 543- 
566 


623 


PR00407


EUKARYOTIC MOLYBDOPTERIN
DOMAIN SIGNATURE 


PR00407K 9.94 8.448e- 
09 326-339 


624 


BL00641 


Respiratory-chain NADH
dehydrogenase 75 Kd
subunit proteins.


BL00641C 21.10 1.000e-
40 157-202 BL00641E
24.37 1.000e-40 255-
308 BL00641P 33.12
1.000e-40 571-623
BL00641A 17.15 1.818e-
37 48-80 BL00641B
12.62 5.846e-34 113-
139 BL00641D 13.23
9.308e-29 216-240


Ml 


PR00103 


CAMP-DEPENDENT PROTEIN
KINASE SIGNATURE


PR00103E 17.80 2.500e-
18 367-380 PR00103B
13.39 2.080e-14 297-
312 PR00103A 9.59
2.957e-14 282-297
PR00103D 10.83 3.077e-
12 346-358 PR00103C
15.68 1.000e-11 334-
344 PR00103B 13.39
1.450e-11 175-190
PR00103A 9.59 1.720e- 
10 160-175 


630


PR00081 


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY 
SIGNATURE 


PR0O0B1A 10.53 d.ille- 
16 4-22 


631 


PF00651


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 8.500e-
14 37-50 


632 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN.


DM01206B 10.69 2.233e-
10 1324-1344 DM01206B 
10.69 4.822e-10 1276- 
1296 DM01206B 10.69 
7.658e-10 1328-1348 
DM01206B 10.69 8.274e- 
10 1280-1300 DM01206B 
10.69 4.532e-09 1320- 
1340 DM01206B 10.69 
7.266e-09 1326-1346 


635


BL00107


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 7.600e- 
23 145-176 BL00107B 
13.31 2.636e-13 211- 
227 


636 


BL00657 


Fork head domain 
proteins . 


BL00657A 19.39 1.545e- 
30 101-143 BL00657B 
22.27 7.750e-26 149- 
192 


637 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.000e-
10 607-623 


643 


BL00018 


EF-hand calcium-binding
domain proteins. 


BL00018 7.41 4.913e-09 
199-212 


647 


PF00628 


PHD-finger. 


PF00628 15.84 2.350e- 
13 385-400 PF00628 
15.84 3.455e-12 464- 
479 


648 


BL01129 


Hypothetical 
yabO/yceC/sfhB family 
proteins . 


BL01129B 13.25 4.000e- 
25 332-357 BL01129C 
25.56 8.200e-23 236- 
279 BL01129B 12.51 
6.118e-13 191-212 


649 


BL01228 


Hypothetical cof family 
proteins. 


BL01228D 17.44 3.908e- 
10 455-480 


650 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 6.684e-
13 771-814 


651 


BL50002 


Src homology 3 (SH3)
domain proteins profile. 


BL50002A 14.19 1.750e- 
12 1026-1045 


653 


PR00253 


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR
SIGNATURE


PR00253A 9.15 4.000e-
24 253-274 PR00253C
13.85 8.800e-24 313-
335 PR00253B 13.47
3.143e-22 279-301
PR00253D 16.68 7.652e-
20 422-443


654 


PD01719 


PRECURSOR GLYCOPROTEIN 
SIGNAL RE . 


PD01719A 12.89 4.452e-
11 969-997 PD01719A 

156 PD01719A 12.89 
7.395e-10 1276-1304
PD01719A 12.89 1.222e- 
09 1220-1248 


657 


BL00354 


HMG-I and HMG-Y DNA- 
binding domain proteins 
(Ahook) . 


BL00354C 6.61 8.397e- 
09 563-578 


658 


BL00354 


HMG-I and HMG-Y DNA- 
binding domain proteins 
(Ahook) . 


BL00354C 6.61 8.397e- 
09 580-595 


659 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.174e- 
13 539-572 DM00215 
19.43 4.750e-12 549-
582 DM00215 19.43
9.824e-11 551-584
DM00215 19.43 2.929e-
10 548-581 DM00215
19.43 4.054e-10 550-
583 DM00215 19.43
5.339e-10 552-585
DM00215 19.43 7.107e- 
10 544-577 


660 


PR00688 


XYLOSE ISOMERASE
SIGNATURE 


PR00688I 13.78 9.518e- 
09 224-236 


661 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 S.950e- 

T) r>A Q (inn 


662 


PR00360


C2 DOMAIN SIGNATURE


PR00360B 13.61 7.158e-
10 596-610 


663 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e-
10 596-610 


664 


PR00360


C2 DOMAIN SIGNATURE


PR00360B 13.61 7.158e-
10 596-610 


666 


PR00819 


CBXX/CFQX SUPERFAMILY 
SIGNATURE 


PR00819B 10.83 8.980e-
10 704-720 


667 


BL50040 


Elongation factor 1 
gamma chain profile. 


BL50040C 22.62 2.143e- 
16 135-178 


668 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 1.360e- 
09 139-153 PR00019A 
11.19 1.667e-09 94-108 
PR00019B 11.36 4.600e- 
09 163-177 


670 


BL00018


EF-hand calcium-binding 
domain proteins. 


BL00018 7.41 3.250e-10 
681-694 BL00018 7.41 
6.400e-10 717-730 


672 


PD00131 


ATP-BINDING TRANSPORT 
TRANSMEMBR.


PD00131B 34.97 1.000e-
34 356-410 PD00131C 
19.59 1.346e-26 504- 
542 


673 


PR00667


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR 
SIGNATURE 


PRC0667G 15.33 '7. ^57e- " 
10 106-123 


674 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 4.857e- 
13 593-608 PR00320B
12.19 4.115e-12 635-
650 PR00320C 13.01
8.435e-11 717-732
PR00320C 13.01 2.800e-
10 635-650 PR00320C
13.01 6.400e-10 593-
608 PR00320B 12.19
3.250e-09 593-608


675 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 4.857e- 
13 572-587 PR00320B 
12.19 4.115e-12 614-
629 PR00320C 13.01
8.435e-11 696-711
PR00320C 13.01 2.800e- 
10 614-629 PR00320C 
13.01 6.400e-10 572- 
587 PR00320B 12.19 
3.250e-09 572-587 


676 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019A 11.19 9.667e-
09 249-263 


679 


PF00642 


zinc finger C-x8-C-x5-C- 
x3-H type (and similar).


PF00642 11.59 3.700e-
16 225-236 PF00642
11.59 7.900e-12 187- 
198 


680 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308C 3.83 8.754e- 
10 286-296 


681 


BL00019 


Actinin-type actin- 
binding domain proteins . 


BL00019D 15.33 4.200e- 
19 227-257 


682


PR00700


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700D 12.47 4.000e- 
09 99-118 


687


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.500e- 
10 538-553 


689 


BL01024 


Protein phosphatase 2A
regulatory subunit PR55
proteins.


BL01024A 16.26 1.000e-
40 22-69 BL01024B
8.91 1.000e-40 86-127
BL01024C 7.80 1.000e-
Hv lib-loo BL01024D 
13.22 1.000e-40 185- 

£,£.£ OljUXUz41£ 11.96 

1.000e-40 222-266

40 266-317 BL01024G 

11.09 1.000e-40 317-
349 BL01024H 13.88
1.000e-40 389-442


691 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 8.d)71e- 
31 152-195 


692 


BL00211 


ABC transporters family 
proteins. 


BL00211A 12.23 5.050e- 
09 45-57 


693 


BL00211


ABC transporters family 
proteins. 


BL00211A 12.23 5.050e- 
09 45-57 


694 


BL00211 


ABC transporters family 
proteins. 


BL00211A 12.23 5.050e- 
09 58-70 


696 


BL00680 


Methionine
aminopeptidase subfamily
1 proteins.


BL0068O 14 37 5 
17 173-195 


697 


BL00741 


Guanine-nucleotide 
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 3.418e-
11 242-265 


698 


DM01930 


2 kw FINGER SMCX SMCY 
YDR096W. 


DM01930E 15.41 1.367e- 
37 170-215 DM01930F 
14.16 8.232e-28 267-
303 DM01930B 19.86 
9.163e-10 37-71 


700 


PR00869 


DNA-POLYMERASE FAMILY X
SIGNATURE 


PR00869A 12.80 1.281e- 
16 245-263 


701 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 2.174e- 
10 77-91 PR00048A 
10.52 6.870e-10 133- 
147 PR00048A 10.52 
8.826e-10 105-119 
PR00048A 10.52 5.320e- 
09 161-175 


702 


BL00523


Sulfatases proteins. 


BL00523E 19.27 2.565e- 
25 326-356 BL00523A 
13.36 5.050e-16 38-55 
BL00523B 8.64 5.909e-
15 86-98 BL00523C
12.64 5.500e-13 137-
148 BL00523D 9.89
1.844e-11 290-302
BL00523G 9.46 5.500e- 
10 513-523 BL00523F 
10.85 6.351e-09 413- 
424


703


PR00048


C2H2-TYPE ZINC FINGER
SIGNATURE 


PR00048A 10.52 8.412e- 
12 376-390 PR00048B 
6.02 1.000e-10 334-344
PR00048B 6.02 1.474e- 
09 364-374 


707 


PD00787 


SYNTHASE BIOSYNTHESIS 
TRANSFERASE. 


PD007D7A 14 94 fl" <J41»_ 
14 66-82 


708 


PR00761 


BINDIN PRECURSOR 
SIGNATURE 


PR00761E 14.32 8.500e- 
10 822-841 


712 


DM01354 


kw TRANSCRIPTASE REVERSE
II ORF2.


DM01354Y 10.69 4.977e-
38 425-465 DM01354X 
13.86 7.300e-34 376- 
315 DM01354V 12.97 
4.923e-17 311-358 
DM01354W 12.64 5.596e- 

10 ^K-T7C 

JLW J9Q*J f © 


713 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 7.545e-

O "7 & C r\ _ A Q C DT nAftl on 

di t »su- h 7o au UuUJ 9 A 
18.44 2.537e-18 147-
186 BL00039C 15.63
2.216e-14 280-304
BL00039B 19.19 1.947e-
13 194-220 


715 


BL00383


Tyrosine specific 
protein phosphatases 
proteins. 


BL00383E 10.35 4.981e-
10 150-161 


717 


PF00777 


Sialyltransferase
family. 


PF00777C 18.60 4.035e- 
21 106-161 


718 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 3.750e- 
39 20-68 DM00031B 
15.41 2.688e-28 04-118 
DM00031C 12.79 1.300e- 
12 131-142 


719


BL00243 


Integrins beta chain 
cysteine-rich domain 
proteins. 


BL00243B 17.54 1.000e-
40 131-172 BL00243C
16.42 1.000e-40 172-
208 BL00243D 24.07
1.000e-40 222-274
BL00243F 22.63 1.000e-
40 314-358 BL00243I
31.77 6.571e-39 607-
650 BL00243E 16.70
3.077e-35 274-304
BL00243G 21.38 3.625e- 

17.53 5.235e-29 567- 
593 BL00243A 17.61 
3.250e-21 63-84 
BL00243H 17.53 7.167e- 
16 477-503 BL00243H 
17.53 2.304e-11 524-
550 BL00243H 17.53
5.304e-11 606-632
BL00243I 31.77 1.380e- 
09 610-653 


720 


PR00217 


43 KD POSTSYNAPTIC
PROTEIN SIGNATURE


PR00217C 10.91 8.022e-
09 20-36 


722 


PR00704 


CALPAIN CYSTEINE 
PROTEASE (C2) FAMILY 
SIGNATURE 


PR00704D 11.05 5.909e-
34 135-161 PR00704F
13.61 7.000e-26 190-
218 PR00704E 12.55
8.071e-26 165-189
PR00704B 17.94 2.241e-
23 75-98 PR00704A
14.68 4.094e-19 30-54
PR00704C 11.88 1.871e-
18 99-116 


725 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e-
09 169-187 


726 


PR00194


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e- 
09 169-187 


727 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320C 13.01 2.125e-
13 277-292 PR00320A
16.74 1.310e-11 277-
292 PR00320C 13.01
4.522e-11 323-338
PR00320A 16.74 6.586e-
11 323-338 PR00320B
12.19 4.343e-10 323-
338 PR00320B 12.19
6.914e-10 277-292


731 


PR00195


DYNAMIN SIGNATURE 


PR00195A 11.94 8.627e- 
16 288-307 PR00195E 
9.82 3.912e-11 457-474


733


PF00642


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 9.082e-
10 787-798 


738 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039A 18.44 2.565e-
28 26-65 BL00039D
21.67 2.105e-20 338-
384 BL00039C 15.63 
9.100e-13 160-184 
BL00039B 19.19 9.617e- 
11 73-99 


739


BL01289


TSC-22 / dip / bun 
family proteins. 


BL01289A 12.18 8.909e- 
31 326-353 BL01289B 
10.45 9.571e-17 353- 
383 


742 


BL01019


ADP-ribosylation factors
family proteins.


BL01019A 13.20 7.075e-
12 41-81


743 


BL00965


Phosphomannose isomerase
type I proteins.


BL00965C 23.78 1.000e-
40 256-305 BL00965B 
17.77 1.600e-25 126- 
153 BL00965A 10.57 
6.400e-19 94-113 


747 


BL00021 


Kringle domain proteins. 



BL00021D 24.56 4.563e- 
25 231-273 BL00021B 
13.33 5.345e-21 60-78


748


BL00612


Osteonectin domain 
proteins. 


BL00612B 11.35 2.034e- 
11 93-126 


749 


PR00450


RECOVERIN FAMILY
SIGNATURE


PR00450C 12.22 6.880e-
10 135-157 


752 


BL00795 


Involucrin proteins.


BL00795C 17.06 6.000e-
11 384-429 BL00795C
17.06 9.444e-11 370-
415 


754 


BL00051


Ribosomal protein L39e
proteins.


BL00051 20.92 1.935e-
16 4-50 


755 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III.


DM01970B 8.60 7.723e-
09 171-184


760


BL01020


SAR1 family proteins.


BL01020C 15.35 9.020e- 
12 99-150 


762 


BL00046


Histone H2A proteins.


BL00046 12.95 1.000e-
40 33-88 


763 


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 21.89 9.137e- 
10 206-240 


764 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 8.800e-
29 417-460 


767 


BL01208


VWFC domain proteins. 


BLO1208B 15.83 6-0*3e- 
10 309-324 BL01208B
15.83 8.031e-10 165-
180 BL01208B 15.83
4.162e-09 85-100


770


BL00031


Nuclear hormones 
receptors DNA-binding 
region proteins. 


BL00031A 19.55 9.571e- 
32 208-241 BL00031B
22.25 5.500e-27 242- 
274 


772 


PR00449


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.450e- 
18 4-26 PR00449E 
13.50 3.520e-14 142- 
165 PR00449C 17.27 
3.032e-13 44-67 
PR00449D 10.79 8.579e- 
13 107-121 PR00449B
14.34 3.455e-11 27-44


773 


BL00523


Sulfatases proteins.


BL00523E 19.27 9.333e-
23 299-329 BL00523A
13.36 2.200e-13 47-64
BL00523B 8.64 2.607e-
13 91-103 BL00523D
9.89 7.923e-12 224-236 
BL00523C 12.54 4.512e- 
10 141-152 BL00523F 
10.85 5.821e-10 373- 
384 


775 


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.686e-
09 568-585 


776


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.686e- 
09 621-638 


777


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.686e- 
09 595-612 


778 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 8.412e-
11 322-341 BL00030A 
14.39 7.000e-10 220- 
239 


779


PR00079


GLUCOSE-6-PHOSPHATE
DEHYDROGENASE SIGNATURE


PR00079B 12.98 2.929e-
26 193-222 PR00079E
16.65 4.150e-23 348-
375 PR00079C 8.68
6.351e-16 246-264
PR00079D 13.51 7.070e-
16 264-281 PR00079A 
16.12 6.769e-13 169- 
183 


781 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 9.250e- 
17 10-35 BL00215A 
15.82 6.000e-16 221- 
246 BL00215A 15.82 
7.857e-12 108-133 
BL00215B 10.44 9.526e- 
11 168-181 


783 


PD00209 


PROTEIN SH3 DOMAIN 
REPEAT PRESYNA. 


PD00289 9.97 6.276e-09 
159-173 


785 


BL00690 


DEAH-box subfamily ATP- 
dependent helicases 
proteins . 


BL00690B 13.38 1.000e-
12 147-165 BL00690A 
6.87 5.320e-10 114-124 
BL00690C 7.51 3.189e- 
09 218-228 


78* ~ 


PR00449 


TRANSFORMING PROTEIN P21
RAS SIGNATURE 


PRD0449C 17.27 S.S'OOe- 

i £ eft *)i GnnftAAan 

13.20 5.235e-14 8-30
PR00449E 13.50 2.853e-
11 150-173 PR00449D
10.79 1.545e-09 111-
125 


788 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 8.767e-
10 1-21 


790 


BL00915


Phosphatidylinositol 3-
and 4-kinases proteins. 


BL00915C 22.43 9.182e- 
39 725-764 BL00915B
22.78 5.050e-33 633-
671 BL00915D 27.02
1.529e-21 795-831
BL00915A 10.09 1.000e-
13 395-407 


791 


PR00208


GLIADIN AND LMW GLUTENIN
SUPERFAMILY SIGNATURE


PR00208A 12.59 6.294e-
10 120-138 PR00208A 
12.59 6.294e-10 121- 
139 PR00208A 12.59
6.294e-10 122-140 
PR00208A 12.59 6.294e- 
10 123-141 PR00208A 
12.59 6.294e-10 124- 
142 PR00208A 12.59 
6.294e-10 125-143 
PR00208A 12.59 6.294e- 
10 126-144 PR00208A 
12.59 6.294e-10 127- 
145 PR00208A 12.59 
6.294e-10 128-146 
PR00208A 12.59 6.294e- 
10 129-147 PR00208A 
12.59 7.411e-09 130- 
148 PR00208A 12.59
7.658e-09 131-149
PR00208A 12.59 7.904e-
09 132-150 PR00208A 
12.59 8.274e-09 118- 
136 PR00208A 12.59 
8.274e-09 119-137 


795 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 5.034e- 
16 302-320 PR00205A 
14.73 1.257e-11 284-
300 PR00205C 13.65
1.333e-11 337-352


796 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412D 16.54 4.000e-
12 196-247 BL00412D
16.54 5.705e-11 197-
248 BL00412D 16.54
7.848e-10 199-250
BL00412D 16.54 1.827e-
09 195-246 BL00412D 
16.54 1.918e-09 194- 
245 BL00412D 16.54 
2.102e-09 201-252 


797 


BL00021 


Kringle domain proteins. 


BL00021B 13.33 6.339e- 
13 40-58 


799


BL01052


Calponin family repeat
proteins.


BL01052C 18.51 1.000e-
40 87-127 BL01052A
16.12 1.529e-32 3-35
BL01052B 15.31 1.257e-
25 52-78 BL01052D
10.26 5.737e-25 174-
194 


800 


BL00348


p53 tumor antigen 
proteins. 


BL00348P 23.19 3.714e- 
09 197-240 


801 


BL00309 


Vertebrate galactoside- 
binding lectin proteins. 


BL00309C 18.65 1.621e- 
09 62-87 


802


PR00245


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245D 10.47 5.224e- 
09 187-199 


804 


PF00774


Dihydropyridine
sensitive L-type calcium
channel (Beta subuni.


PF00774A 16.47 8.457e-
10 110-156 


808 


PR00667


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR 
SIGNATURE 


PR00667C 11.71 9.875e- 
09 12-28 


810 


PD02346 


PHOTOSYSTEM II PROTEIN
PRECURSOR
PHOTOSYNTHESIS.


PD02346F 12.89 4.340e-
09 317-354




811


BL00685 


CBP-A/NF-YB subunit
proteins. 


BL00685B 14.41 6.779e- 
14 54-95 BL00685A 
11.22 4.798e-13 5-54


812 


PR00080


ALCOHOL DEHYDROGENASE
SUPERFAMILY SIGNATURE


PR00080A 9.32 9.419e-
10 93-105 


813 


BL00357 


Histone H2B proteins. 


BL00357 7.74 1.988e-17 
22-65 


815 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 7.923e-
15 158-171 PD00066 

XI QO ^ *)Ana_1J AC-CO 

PD00066 13.92 7.000e- 

■Jmt j.o— jjl rUUUUbo 

13.92 7.000e-13 130- 
143 pnnoofif? q*> 

J-^-> rlJUUUOD J.J. j^i 

7.500e-13 214-227 
PD00066 13.92 9.000e- 

J- J AWa— CUUV W59 

13.92 4.429e-12 186-
199 PD00066 13.92
1.783e-11 74-87


816 


BL01195 


Peptidyl-tRNA hydrolase
proteins. 


BL01195C 20.12 3.348e- 


820 


BL00520


Interleukin-10 family
proteins.


BL00520A 6.21 6.471e-
09 1-14 


822 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins.


BL00972A 11.93 8.113e- 
09 224-242 


825 


PR00876 


NEMATODE METALLOTHIONEIN
SIGNATURE 


PR00876B 7.66 2.268e- 
10 101-115 


829 


PD02855 


flavoprotein protein
DNA/PANTOTHEN.


PD02855A 18.37 4.732e-
28 88-124 PD02855B 
o.jo O. 3^86-09 1J2-142 


830 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE


PR00405B 11.83 7.000e-
21 44-62 PR00405C 
19.41 1.000e-13 65-87 
PR00405A 17.71 7.283e- 
13 25-45 


831 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 1.000e-
09 47-61 PR00019B 
11.36 1.720e-09 136- 

1 cn DDrtnmaD 11 ic 

3.880e-09 44-58


832 


PR00011 


TYPE III EGF-LIKE
SIGNATURE


16 164-183 PR00011D 
14.03 6.850e-16 164-
183 PR00011A 14.06 
8.364e-14 164-183 
PR00011C 24.25 5.415e- 
12 231-260 PR00011D 
14.03 9.852e-11 212-
231 


834 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e- 
12 232-246 


835 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 4.000e-
10 290-304 


836


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e- 
12 216-230 


837 


DM00215 


PROLINE-RICH PROTEIN 3 . 


DM00215 19.43 3.898e- 
09 78-111 


839 


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN.


PD02784B 26.46 8.302e- 
09 73-116 


840 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700B 16.80 5.091e- 
22 369-390 PR00700D 
12.47 5.765e-21 491- 
510 PR00700C 13.17 
4.750e-14 449-467
PR00700F 11.18 8.500e-
11 538-549 PR00700E
17.57 3.100e-10 522- 
538 


841 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 £.404e- " 
13 134-153 


844 


PD02785 


PROTEIN RIBOSOMAL 60S 
L22 RNA-BINDING HEP. 


PD02785B 14.43 1.000e-
40 58-112 PD02785A
15.23 1.915e-28 8-57


845


BL00826


MARCKS family proteins.


BL00826C 7.63 6.738e-
09 203-230 


846 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 4.429e- 
10 15-24 


849 


BL00518


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 1.000e-
08 340-349 


850 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308A 5.90 6.506e-
09 12-27 


851 


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 21.89 7.000e- 
16 246-280 


852 


BL00420 


Speract receptor repeat 
proteins domain 
proteins . 


BL00420B 22.67 1.000e-
40 723-778 BL00420B
22.67 1.321e-38 933-
988 BL00420B 22.67
8.457e-28 482-537
BL00420B 22.67 4.500e-
27 587-642 BL00420B
22.67 9.625e-27 270-
325 BL00420B 22.67
4.205e-26 163-218
BL00420B 22.67 5.731e-
23 55-110 BL00420B
22.67 6.464e-20 377-
432 BL00420B 22.67
2.800e-15 830-885
BL00420C 11.90 1.900e-
13 355-366 BL00420C
11.90 1.900e-12 808-
819 BL00420C 11.90
3.550e-12 248-259
BL00420C 11.90 2.831e-
11 141-152 BL00420C
11.90 5.119e-11 1018-
1029 BL00420C 11.90
7.955e-10 567-578


853 


BL00420 


Speract receptor repeat 
proteins domain 
proteins. 


BL00420B 22.67 1.000e-
40 756-811 BL00420B
22.67 1.321e-38 966-
1021 BL00420B 22.67
8.457e-28 482-537
BL00420B 22.67 4.500e-
27 620-675 BL00420B
22.67 9.625e-27 270-
325 BL00420B 22.67
4.205e-26 163-218
BL00420B 22.67 5.731e-
23 55-110 BL00420B
22.67 6.464e-20 377-
432 BL00420B 22.67
2.800e-15 863-918
BL00420C 11.90 1.900e-
13 355-366 BL00420C
11.90 1.900e-12 841-
852 BL00420C 11.90
3.550e-12 248-259
BL00420C 11.90 2.831e-
11 141-152 BL00420C
11.90 5.119e-11 1051-
1062 BL00420C 11.90
7.955e-10 567-578


857 


PR00388 


3',5'-CYCLIC NUCLEOTIDE
CLASS II
PHOSPHODIESTERASE
SIGNATURE


PR00388A 10.45 2.778e- 
09 64-83 


859 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 2.929e-
13 37-56 BL00030B
7.03 1.900e-11 167-177
BL00030A 14.39 2.000e- 
10 128-147


861


PR00988


URIDINE KINASE SIGNATURE


PR00988A 6.39 4.250e-
17 23-41 PR00988C 
13.64 8.714e-16 107- 
123 PR00988F 12.23 
7.828e-15 198-212 
PR00988E 8.27 9.769e- 
12 176-168 PR00988D 
5.95 8.250e-11 163-174
PR00988B 11.60 4.512e-
10 60-72


863


BL00215


Mitochondrial energy
transfer proteins.


BL00215B 10.44 8.071e-
12 41-54


864


PR00775 


90 KD HEAT SHOCK PROTEIN 
SIGNATURE 


PR00775E 8.06 1.000e-
24 198-221 PR00775B 

PR00775D 8.91 4.484e- 
17 171-189 PR00775A 
9.90 9.342e-17 86-107
PR00775C 10.68 9.379e- 
17 153-171 PR00775G 
10.64 6.850e-15 267- 
286 PR00775F 12.76 
6.769e-14 249-267 


844 


DM01688


2 POLY-IG RECEPTOR.


DM01688G 16.45 9.460e-
09 89-121


867 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL- 
BINDING NU. 


PD01066 19.43 5.596e- 
29 14-53 


868 


BL01287 


RNA 3'-terminal
phosphate cyclase
proteins.


BL01287A 17.95 2.688e-
26 16-48 


869 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 6.464e-
10 304-337 


872 


BL00046


Histone H2A proteins.


BL00046 12.95 1.000e-
40 30-85 


874 


BL00188 


Biotin-requiring enzymes
attachment site
proteins.


BL00188 30.29 9.036e-
32 665-711 


876 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.686e-
09 298-315


877


PD02102


SUBUNIT E V-ATPASE
VACUOLAR ATP SYNTHASE
HYDROL.


PD02102A 16.74 4.176e-
10 97-141 


879 


BL01189 


Ribosomal protein S12e
proteins.


BL01189A 14.27 1.000e-
40 35-71 BL01189B 
13.49 1.000e-40 71-125 


882 


BL00284 


Serpins proteins. 


BL00284C 28. 5* 6.4Q0e- 
25 62-104 BL00284B 
17.99 6.182e-12 35-56 


889 


BL00216 


Sugar transport
proteins.


BL00216B 27.64 4.375e- 
21 35-85 


896 


PR00391 


PHOSPHATIDYLINOSITOL
TRANSFER PROTEIN 
SIGNATURE 


PR00391E 12.50 7.^85e- 
15 211-231 PR00391B 
8.39 1.000e-13 83-104 
PR00391D 12.21 9.328e- 
13 191-207 PR00391A 
7.83 5.390e-11 16-36


897


PR00327


ICE NUCLEATION PROTEIN
SIGNATURE


PR00327C 6.37 5.24^e-
09 313-328


898 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 7.800e-
26 386-432 BL00039A 
18.44 6.K74e-16 113- 
152 BL00039B 19.19 
1.947e-13 153-179 
BL00039C 15.63 9.460e- 
11 236-260 


901 


PD00066 


PROTEIN ZINC-FINGER 
METAL-BINDI. 


PD00066 13.92 8.200e- 
16 254-267 PD00066
13.92 8.200e-16 282-
295 PD00066 13.92
8.200e-16 310-323
PD00066 13.92 8.200e-
16 366-379 PD00066
13.92 8.200e-16 394-
407 PD00066 13.92
8.200e-14 338-351


902 


BL01115


GTP-binding nuclear 
protein ran proteins. 


"BL01115A 10. ii 9.321e- 
11 6-50 


903 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 9.160e- 
09 97-111 


904 


PR00381 


KINESIN LIGHT CHAIN 
SIGNATURE 


PR00381E 8.75 6.586e-
25 335-356 PR00381B
18.17 2.667e-24 204-
224 PR00381A 9.55 
2.800e-24 107-125 
PR00381C 12.48 4.522e- 
24 226-245 PR00381D 
13.94 1.084e-22 291-
309 PR00381F 9.13 
3 2B8e~22 370- 
PR00381F 9.13 7.181e- 
13 286-308 PR00381E
8.75 4.066e-11 251-272
PR00381E 8.75 7.033e-
11 293-314 PR00381E
8.75 8.364e-10 377-398
PR00381D 13.94 5.230e- 
09 333-351 PR00381C 
12.48 7.120e-09 310- 
329 


906 


PR00345 


STATHMIN FAMILY 
SIGNATURE 


PR00345C 4.54 8.557e- 
09 525-549 


907 


PR00345 


STATHMIN FAMILY 
SIGNATURE 


PR00345C 4.54 8.557e-
09 513-537 


908 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 9.308e-11
144-155 


910 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.800e- 
30 48-87 


912 


BL01104 


Ribosomal protein L13e 
proteins . 


BL01104C 15.14 6.000e- 
09 364-392 


922 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 3.842e-09 
500-511 


923 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 2.500e-
09 323-338 PR00320C 
13.01 5.500e-09 167- 
202 


924 


PD02181 


PROTOCHLOROPHYLLIDE
REDUCTASE PHOTOSYNT.


PD02181D 12.85 8.609e- 
09 36-64 


926 


BL00019 


Actinin-type actin- 
binding domain proteins. 


Bt0001$C 14.60.45^- 
25 108-144 BL00019B 
13.34 6.S10C-11 61-84 
BL00019D 15.33 9.338e- 
11 205-235 BL00019A 
12.56 2.373e-10 34-45


928 


BL00678 


Trp-Asp (WD) repeat


BL00678 9.67 9.308e-11






WO 01/53312 



PCT/US00/34263 









proteins proteins.


273-284 BL00678 9.67
1.600e-10 314-325 
BL00678 9.67 7.600e-10 
360-371 BL00678 9.67 
8.579e-09 206-217 


929 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 1.857e- 
10 137-146 


930 


BL01085 


Ribulose-phosphate 3-
epimerase family
proteins.


BL01085D 16.55 4.600e- 
24 134-165 BL01085B 
10.15 5.680e-22 30-52 
BL01085E 16.87 6.676e- 
20 172-202 BL01085C 
21.81 2.038e-14 66-97 


931 


BL01085 


Ribulose-phosphate 3-
epimerase family
proteins.


BL01085D 16.55 4.600e-
24 152-183 BL01085B 
10.15 5.680e-22 30-52 
BL01085E 18.87 8.676e- 
20 190-220 BL01085C 
21.81 2.038e-14 66-97 


933 


PD00301 


PROTEIN REPEAT MUSCLE
CALCIUM-BI.


PD00301A 10.24 6.400e-
09 160-171 


936 


PF00168 


C2 domain proteins. 


PF00168C 27.49 4.000e- 
12 336-362 


937 


BL00415 


Synapsins proteins. 


BL00415N 4. 2^ 9.£l9e- 
10 5-49 


940 


PR00862 


PROLYL OLIGOPEPTIDASE
SERINE PROTEASE (S9A) 
SIGNATURE 


PR00862D 16.17 4.086e- 
09 63-84 


945


BL01230 


RNA methyltransferase
trmA family proteins. 


BL01230B 11.62 2.373e-
09 407-420 


948


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins.


BL00479B 12.57 7.429e-
18 52-68 BL00479A 
19.86 2.200e-13 26-49 


949 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 1.474e-09 
100-111 


954 


PD01311 


PROTEIN OXIDOREDUCTASE
NAD INTERGENIC RE. 


PD01311A 30.23 5.909e-
10 66-111 


955 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 3.250e- 
12 47-60 


956


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 3.250e- 
12 47-60 


957 


BL00379 


CDP-alcohol
phosphatidyltransferases
proteins.


BL00379 24.64 1.610e-
15 111-148 


959 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 1.884e- 
10 31-75 


960 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 3.438e-
14 110-154 


962


BL00061 


Short-chain
dehydrogenases/reductase
s family proteins.


BL00061B 25.79 6.S86^ 
13 198-236 


963 


PR00502 


MUTT DOMAIN SIGNATURE 


PR00502A 15.06 8.200e- 
11 210-225 


966 


PR00308


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308A 5.90 7.035e- 
09 55-70 


967 


DM01206


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 1.286e-
12 104-124 DM01206B
10.69 5.299e-11 23-43
DM01206B 10.69 8.274e-
10 73-93 DM01206B
10.69 3.962e-09 108-
128 DM01206B 10.69
5.671e-09 38-58


969 


PF01008 


initiation factor 2 
subunit.


PF01008B 25.59 4.724e- 
31 417-460 PF01008C
12.25 5.333e-18 506-
526 PF01008A 20.14 
5.875e-15 369-390


970 


BL01277 


Ribonuclease PH 
proteins . 


BL01277C 10.18 7.648e- 
10 112-143 BL01277A 
17.39 9.806e-10 40-78 


975 


BL01159


WW/rsp5/WWP domain
proteins.


BL01159 13.85 3.605e- 
12 130-145 BL01159 
13.85 4.122e-10 171- 
186 


977 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791C 20.98 2.235e-
09 55-94 


978


BL01167


Ribosomal protein L17
proteins.


BL01167B 20.64 8.258e-
19 88-127 


979 


BL00478 


LIM domain proteins.


BL00478B 14.79 9.357e- 
13 33-48 BL00478B 
14.79 7.250e-12 98-113 


980 


PR00312 


CALSEQUESTRIN SIGNATURE 


PR00312E 8.32 3.423e- 
36 169-199 PR00312I 
15.78 5.286e-35 332- 
361 PR00312P 15.06 
5.865e-35 199-229 

PR00315H n 11 A iiin_ 
ri\uuJX«n IJ . JJ. O . jXjc 

35 263-291 PR00312J 

392 PR00312D 9.43 
2.63£e-33 128-mfl 
PR00312C 15.14 8.839e- 
33 92-122 PR00312B 
15.08 8.941e-33 62-92 
PR00312G 11.11 6.657e- 
32 230-258 PR00312A 
11.70 6.914e-27 35-59 


981 


PF00992


Troponin.


PF00992A 1*.£7 8,816e- 
09 414-449 


982 


PR00299 


ALPHA CRYSTALLIN
SIGNATURE 


PR00299F 13.20 2.367e- 
09 127-149 


983 


BL01150 


Respiratory-chain NADH
dehydrogenase 20 Kd 
subunit proteins. 


BL01150B 17.16 1.000e-
40 156-202 BL01150A
14.10 8.200e-39 100- 
136 


986 


BL00795 


Involucrin proteins.


14 4-49 BL00795C 
17.06 1.77fle-ll 
BL00795C 17.06 3.407e- 
10 14-59 BL00795C 
17.06 7.802e-10 2-47 
BL00795C 17.06 8.640e- 
10 19-64 BL00795C 
17.06 7.400e-09 11-56
BL00795C 17.06 7.800e- 
09 3-48 


987 


BL00939 


Ribosomal protein L1e
proteins.


BL00939F 17.27 9.393e-
09 810-840 


988 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e- 
11 525-541 


989 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e- 
11 497-513 


994 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 2.500e- 
25 146-189 


997 


BL01304 


ubiH/COQ6 monooxygenase 
family proteins. 


BL01304A 8.05 3.893e- 
11 65-79 


998 


DM01767 


5 TRANSMITTER DOMAIN. 


DM017*7B 10.07 7.8^8e- 
09 22-39 


1000


PR00926 


MITOCHONDRIAL CARRIER
PROTEIN SIGNATURE 


PR00926C 16.07 1.750e-
24 73-94 PR00926D 
10.53 3.250e-23 126- 
145 PR00926F 17.75
6.211e-23 217-240 
PR00926E 11.70 6.625e- 











20 174-193 PR00926B 
16.07 2.125e-18 24-39 
PR00926A 10.41 1.000e-
15 11-25 PR00926F 
17.75 5.565e-09 120- 
143 


1005 


BL00406


Actins proteins. 


BL00406B $.47 1.000e-
40 88-143 BL00406C 
6.75 1.000e-40 147-202 
BL00406D 12.58 3.700e- 
40 270-325 BL00406E 
8.44 7.375e-38 327-377 
BL00406A 9.95 3.348e- 
29 11-46 


1006 


BL00406 


Actins proteins. 


BL00406B $.47 1.000e-
40 88-143 BL00406C
6.75 1.000e-40 147-202
BL00406E 8.44 1.000e-
35 246-298 BL00406A 
9.95 3.348e-29 11-46 


1007


PR00304


TAILLESS COMPLEX
POLYPEPTIDE 1
(CHAPERONE) SIGNATURE


PR00304D 11.04 8.714e-
22 384-407 PR00304C 
8.69 4.667e-20 98-118 
PR00304B 11.60 7.577e- 
19 68-87 PR00304A 
9.20 3.382e-16 46-63 
PR00304E 7.79 6.870e- 
13 418-431 


1009 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 2.929e- 
32 9-48


1011 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.929e-
32 68-107 


1012 


BL00518


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 6.143e- 
10 64-73 


1016 


PD01168 


SYNTHETASE LIGASE
PROTEIN ALANYL.


PD01168H 12.08 1.000e-
11 174-194 


1018 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 1.391e-
32 261-302 PD00930A 
25.62 9.550e-22 157- 
183 


1022 


BL00175 


Phosphoglycerate mutase 
family phosphohistidine 
proteins.


BL00175A 15.42 5.179e-
12 6-26 BL00175C
23.75 8.062e-10 79-111


1025 


PR00305


14-3-3 PROTEIN ZETA 
SIGNATURE 


PR00305D 16.34 1.439e- 
10 158-185 


1026 


BL00353 


HMG1/2 proteins. 


BL00353B 11.47 2.436e- 
18 238-288 BL00353C 
14.83 8.844e-11 288-
335 


1028 


BL00183


Ubiquitin-conjugating
enzymes proteins. 


BL00183 28.97 1.310e- 
33 43-91 


1033 


PF00580 


UvrD/REP helicase. 


PF00580A 13.37 4.720e-
09 111-133 


1034 


PR00413 


HALOACID
DEHALOGENASE/EPOXIDE
HYDROLASE FAMILY
SIGNATURE


PR00413E 15.78 3.429e- 
09 154-171


1037 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 9.657e- 
09 5-44 


1038 


PD01796 


PROTEIN TRANSMEMBRANE 
COBALT ZINC CADMIU. 


PD(Jl796^ l*.6i 4. 25^96- 
11 55-82 


1039


BL00299 


Ubiquitin domain 
proteins . 


BL00299 28.84 9.036e- 
09 17-69 


1040 


PR00970 


ARGININE ADP- 
RIBOSYLTRANSFERASE 


PR00970A 17.73 6.143e-
20 56-78 PR00970D









SIGNATURE 


9.96 2.154e-18 154-171
PR00970F 12.30 1.000e-
16 224-241 PR00970G
9.97 9.229e-15 242-258
PR00970B 16.37 1.290e-
13 86-105 PR00970C
11.05 1.643e-11 115-
130 PR00970E 11.23
9.820e-11 202-218


1042 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 2.200e-10 
243-254 


1043 


PR00048 


C2H2-TYPE ZINC FINGER
SIGNATURE 


PR00048A 10.52 6.786e- 
13 114-128 PR00048A 
10.52 1.000e-09 172- 
186 


1045 


BL00615 


C-type lectin domain 
proteins . 


BL00615A 16.68 1.720e- 
11 218-236 BL00615B 
12.25 1.857e-10 317- 
331 


1046 


BL01092 


Adenylate cyclases 
class-I proteins.


BL01092N 13.54 8.924e- 
10 3-40 


1047 


BL01216


ATP-citrate lyase /
succinyl-CoA ligases
family proteins. 


BL01216D 21.75 4.316e- 
28 314-344 BL01216A 
13.91 1.000e-10 97-112 


1049 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 7.618e- 
12 102-136 


1050 


BL01073 


Ribosomal protein L24e 
proteins . 


BL01073 24.30 1.000e-
40 12-62 


1054 


BL00571


Amidases proteins. 


BL00571 25.^9 5.875e- 
31 160-212 


1055 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 5.235e- 
11 98-117 BL00030B
7.03 4.316e-09 137-147 


1058 


BL00223 


Annexins repeat proteins 
domain proteins . 


BL00223C 24.79 8.754e- 
23 262-317 BL00223A 
15.59 9.478e-14 46-80
BL00223A 15.59 5.557e- 
11 118-152 


1060 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 3.455e- 
35 158-201 


1064 


BL00455 


Putative AMP-binding 
domain proteins. 


BL00455 13.31 6.211e-
13 280-296 


1065 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019A 11.19 2.000e- 
09 115-129 PR00019B 
11.36 3.880e-09 87-101 


1066 


PR00326 


GTP1/OBG GTP-BINDING 
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 4.600e- 
16 151-172 PR00326C 
9.79 1.290e-14 200-216
PR00326B 16.74 8.548e- 
14 172-191 PR00326D 
19.09 1.257e-13 217- 
236 


1071 


PD02870


RECEPTOR INTERLEUKIN-1 
PRECURSOR. 


PD02870B 18.83 8.518e- 
11 164-197 


1072 


PF00856


SET domain proteins. 


PF00856A 26.14 5.976e- 
09 350-387 


1075 


BL01009 


Extracellular proteins 
SCP/Tpx-1/Ag5/PR-1/Sc7
proteins. 


BL01009D 14.19 4.300e- 
20 127-148 BL01009A 
13.75 6.586e-13 57-75 
BL01009E 13.50 1.439e- 
11 159-175 


1077 


PR00724 


CARBOXYPEPTIDASE C
SERINE PROTEASE (S10) 
FAMILY SIGNATURE 


PR00724A 10.91 1.000e-
08 366-379 


1078 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 1.000e-
12 170-195 BL00215A 
15.82 7.529e-10 79-104 


1079 


BL00678 


Trp-Asp (WD) repeat 


BL00678 9.67 4.316e-09






proteins proteins. 


298-309 


1081 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 7.398e- 
10 23-57 


1094 


BL00460


Glutathione peroxidases 
selenocysteine proteins.


BL00460A 28.67 3.204e- 
18 57-92 BL00460B 
9.73 6.400e-13 100-118 
BL00460D 16.89 9.143e- 
12 162-182 BL00460C 
14.35 5.500e-09 133- 
156 


1095 


PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB 
FIMBRIA TRAN. 


PD02811A 20.67 3.017e-
22 67-105 PD02811B 
17.07 2.263e-21 118- 
151 PD02811C 13.25 
5.696e-13 154-167 


1096 


PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB 
FIMBRIA TRAN. 


PD02811A 20.67 3.017e- 
22 60-98 PD02811B 
17.07 2.263e-21 111- 
144 PD02811C 13.25
5.696e-13 147-160 


1097 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479B 12.57 6.143e- 
09 200-216 


1105 


PF00881 


Nitroreductase family. 


PF00881A 27.15 9.229e-
13 111-147 


1109 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 3.077e- 
10 15-37 PR00449E 
13.50 1.857e-09 185- 
208 PR00449D 10.79 
8.364e-09 131-145 


1115 


PR00405


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 5.737e- 
20 42-60 PR00405A 
17.71 2.703e-17 23-43 
PR00405C 19.41 6.902e- 
10 53-85 


1116 


BL00355


HMG14 and HMG17 
proteins . 


BL00355 5.97 2.528e-25
20-51 


1117 


BL00355 


HMG14 and HMG17 
proteins . 


BL00355 5.97 2.528e-25
20-51 


1120 


BL00107


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 4.857e- 
10 290-306 


1123 


PR00412 


EPOXIDE HYDROLASE 
SIGNATURE 


PftOO^F - 18.7* 9.S2*e- ■ 
12 301-324 


1125 


PR00186


HEMERYTHRIN SIGNATURE 


PR00186A 13.62 2.800e- 
09 87-101 


1129 


BL00170 


Cyclophilin-type
peptidyl-prolyl cis-
trans isomerase
signatur.


BL00170C 18.49 3.077e- 
33 84-129 BL00170B 
20.97 6.838e-25 37-77 
BL00170A 17.08 3.455e- 
15 10-37 


1131 


BL00636


Nt-dnaJ domain proteins. 


BL00636A 8.07 5.304e- 
15 29-46 BL00636B 
15.11 1.360e-14 59-80 


1132 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 6.211e-09 
29-40 


1133 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 6.211e-09 
29-40 


1136 


BL00990 


Clathrin adaptor 
complexes medium chain 
proteins. 


BL00990C 18.78 4.176e- 
38 235-269 BL00990A 
21.44 4.316e-36 94-132 
BL00990B 20.15 2.125e- 
27 157-187 BL00990D 
16.13 5.320e-18 403- 
422 


1137 


PR00314 


CLATHRIN COAT ASSEMBLY 
PROTEIN SIGNATURE 


PR00314B 15.68 8.000e-
34 100-128 PR00314D 
9.66 3.531e-33 233-261 
PR00314C 16.05 8.909e-








32 159-188 PR00314A 
14.53 1.281e-22 13-34


1139 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 6.364e- 
13 13-57 


1141 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e-
19 451-482 BL00107B 
13.31 3.077e-12 519- 
535 


1148


PR00685 


TRANSCRIPTION INITIATION 
FACTOR IIB SIGNATURE 


PR00685A 13.62 4.676e- 
09 21-42 


1155 


PD01652 


RECEPTOR CELL NK 
GLYCOPROTEIN IMMUNOGLOB.


PD01652B 8.50 9.396e- 
10 522-574 PD01652B 
8.50 9.4$3e-10 740-792 


1157 


PD02894 


HYDROLASE N4- PRECURSOR 
PROTEIN SIGNAL BE. 


PD02894A 21.96 7.873e- 
28 81-127 PD02894B 
13.93 1.188e-27 178- 
211 


1159 


BL00623 


GMC oxidoreductases 
proteins . 


BL00623E 15.00 3.531e- 
20 391-414 BL00623C 
10.86 4.240e-20 155- 
176 


1161 


PD01937 


DNA PROTEIN POLYMERASE 
ENDONUCLEASE DNA- . 


PD01937A 6.68 3.475e-
09 330-341 


1162 


PD01937 


DNA PROTEIN POLYMERASE 
ENDONUCLEASE DNA- . 


PD01937A 6.68 3.475e-
09 221-232 


1163 


PR00624 


HISTONE H5 SIGNATURE


PR00624D 11.94 7.455e-
10 214-239 PR00624D 
11.94 1.961e-09 312- 
337 


1167 


BL00226 


intermediate filaments 
proteins. 


BL00226B 23.86 7.384e-
09 302-350 


1177 


BL01032 


Protein phosphatase 2C 
proteins . 


BL01032G 8.33 1.422e- 
10 34-48 


1178 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 1.794e-
10 205-220 PR00320C
13.01 7.840e-10 205-
220 PR00320B 12.19
8.457e-10 35-50
PR00320A 16.74 7.146e-
09 35-50 PR00320B
12.19 9.100e-09 79-94


1180 


PR00454 


ETS DOMAIN SIGNATURE 


PR00454D 10.89 4.150e- 
19 765-784 


1181 


BL00291 


Prion protein. 


BL00291A 4.49 8.962e- 
11 152-187 


1184 


BL00720 


Guanine-nucleotide 
dissociation stimulators 
CDC25 family sign. 


BL00720B 16.57 4.103e-
18 1089-1113 


1185 


BL00215


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 4.553e-
13 204-229 BL00215A 
15.82 1.429e-12 11-36 
BL00215A 15.82 9.809e- 
11 104-129 


1187 


BL00983 


Ly-6 / u-PAR domain
proteins.


BL00983C 12.69 2.761e-
10 77-93 


1188 


BL00878 


orn/DAP/Arg 

decarboxylases family 2 
pyridoxal-P attachment 
Si . 


BL00878B 10.95 6.000e- 
16 189-204 BL00878C 
17.74 8.435e-15 225- 
245 BL00878F 19.67 
3.625e-13 379-402 
BL00878D 16.56 1.621e- 
09 270-289 


1191 


PD02939 


PROTEIN GLUTATHIONE 
SYNTHETASE SY. 


PD02939B 10.10 2.723e- 
12 203-220 PD02939C 
20.01 1.000e-11 224-
252 


1193 


PR00345


STATHMIN FAMILY
SIGNATURE 


PR00345B 7.12 2.800e-
28 72-101 PR00345E








8.54 7.652e-28 149-174 
PR00345C 4.54 9.100e- 
28 101-125 PR00345D
10.97 1.964e-24 125- 
149 PR00345A 13.46 
5.645e-16 43-62 


1194


PR00345


STATHMIN FAMILY
SIGNATURE 


PR00345B 7.12 2.800e- 
28 108-137 PR00345E
8.54 7.652e-28 185-210 
PR00345C 4.54 9.100e- 
28 137-161 PR00345D 
10.97 1.964e-24 161- 
185 PR00345A 13.46
5.645e-16 79-98


1195 


PF00995 


Sec1 family.


PF00995B 17.37 1.120e- 
13 224-264 


1196 


BL00982


Bacterial-type phytoene
dehydrogenase proteins. 


BL00982A 18.41 6.738e-
11 15-47 


1197 


BL01298 


Dihydrodipicolinate
reductase proteins. 


BL01298A 13.50 5.959e-
09 51-73 


1203 


BL00061 


Short-chain
dehydrogenases/reductase
s family proteins.


BL00061B 25.79 1.000e-
14 152-190 


1204 


PR00118 


BETA-LACTAMASE CLASS A
SIGNATURE 


PR00118F 16.42 $.3B6e- 
09 213-229 


1206 


BL01183 


ubiE/COQ5
methyltransferase family
proteins.


BL01183B 21.31 1.429e- 
37 184-229 BL01183D 
27.71 8.535e-27 264- 
307 BL01183A 13.25 
3.250e-23 51-73 
BL01183C 10.77 5.295e-
09 246-258 


1208 


BL00979 


G-protein coupled 
receptors family 3 
proteins. 


BL00979L 20.63 2.485e- 
09 105-146 


1209 


PF00023


Ank repeat proteins. 


PF00023A 16.03 4.857e- 
11 49-65 PF00023B 
14.20 1.818e-09 45-55


1212 


PR00048


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.750e-
14 227-241 PR00048A
10.52 4.316e-11 199-
213 


1213 


PR00450 


RECOVERIN FAMILY
SIGNATURE 


PR004*bC 12.22 1.72'be- 
10 20-42 PR00450C 
12.22 3.506e-09 56-78 
PR00450D 16.58 6.769e- 
09 44-64 


1216 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 5.598e- 
10 179-230 


1219 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.0$ 5.348e- 
11 249-264 


1222 


PD00066


PROTEIN ZINC-FINGER 
METAL-BINDI.


PD00066 13.92 7.231e-
15 295-308 PD00066 
13.92 7.231e-15 406- 
419 PD00066 13.92 
2.286e-12 378-391 
PD00066 13.92 7.857e- 
12 434-447 PD00066 
13.92 3.348e-11 350-
363 


1223 


BL50058 


G-protein gamma subunit 
profile. 


BL50058 27.23 1.000e-
40 13-61 


1226 


BL00412 


Neuromodulin (GAP-43) 
proteins. 


BL00412D 16.54 8.439e- 
09 279-330 


1227 


BL00437 


Catalase proximal heme- 
ligand proteins. 


BL00437A 18.82 1.000e-
40 49-101 BL00437B
16.28 1.000e-40 114-
168 BL00437C 21.86








1.000e-40 190-239 
BL00437D 25.72 1.000e-
40 248-301 BL00437E 
23.95 1.000e-40 327- 
379 


1230


BL01160


Kinesin light chain 
repeat proteins. 


BL01160B 14.54 8.297e- 
10 S-60 | 


1231


PR00735


GLYCOSYL HYDROLASE 
FAMILY 8 SIGNATURE 


PR00735A 11.19 6.857e-
09 391-405 


1232 


PR00497 


NEUTROPHIL CYTOSOL
FACTOR P40 SIGNATURE


PR00497A 6.92 5.553e-
10 158-176 


1233 


PR00497 


NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE 


PR00497A 6.92 5.553e-
10 158-176 


1235 


BL00866 


Carbamoyl-phosphate
synthase subdomain 
proteins. 


BL00866B 36.29 2.776e- 
09 75-121 


1237 


BL00027


'Homeobox' domain
proteins. 


BL00027 26.43 1.818e-
21 36-79 


1243 


PR00403 


WW DOMAIN SIGNATURE 


PR00403B 12.19 1.184e- 
11 10-25 


1246 


PD01168 


SYNTHETASE LIGASE 
PROTEIN ALANYL. 


PD01168L 9.47 2.837e- 
10 31-46 PD01168L 
9.47 4.490e-10 174-189 
PD01168L 9.47 7.612e- 
10 183-198 


1249 


BL00018 


EF-hand calcium-binding 
domain proteins. 


BL00018 7.41 2.600e-10 
183-196 


1254 


BL00183 


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 2.440e- 
36 96-144 


1255 


BL01115


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 5.670e- 
11 8-52 


1256 


BL00373 


Phosphoribosylglycinamid
e formyltransferase
proteins. 


BL00373C 10.35 3.348e-
12 143-156 


1258 


PR00011 


TYPE III EGF-LIKE
SIGNATURE 


PR00011B 13.08 3.217e- 
10 174-193 


1259 


BL00518


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 8.286e- 
10 31-40 


1261 


PR00070


DIHYDROFOLATE REDUCTASE 
SIGNATURE 


PR00070D 11.63 1.000e-
15 112-127 PR00070C
13.09 9.500e-15 51-63
PR00070A 12.92 5.500e-
12 16-27


1262 


BL00462 


Gamma-
glutamyltranspeptidase
proteins. 


BL00462A 20.89 6.438e- 
24 140-163 BL00462B 
17.88 5.500e-20 230- 
267 BL00462C 27.41 
2.023e-11 292-347


1263 


BL00038 


Myc-type, 'helix-loop-
helix' dimerization
domain proteins. 


BL00038B 16.97 9.455e- 
11 62-83 


1264 


BL01115


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 5.670e- 
11 17-61 


1266 


PR00837 


ALLERGEN V5/TPX-1 FAMILY
SIGNATURE 


PR00837C 17.21 2.714e-
18 165-162 PR00837A
14.77 4.512e-12 86-105 
PR00837D 11.12 7.577e- 
12 201-215 




PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449C 17.27 9.308e-
22 40-63 PR00449B 
13.50 1.000e-16 137- 
160 PR00449D 10.79 
3.520e-11 102-116


1270 


BL00276 


Channel forming colicins
proteins.


BL00276A 8.87 1.500e-
09 17-29 


1275 


PD02327 


GLYCOPROTEIN ANTIGEN 
PRECURSOR IMMUNOGLO. 


PD02327C 15.47 9.769e- 
09 228-243 


1276 


PR00412 


EPOXIDE HYDROLASE


PR00412B 12.59 7.894e-






SIGNATURE 


12 119-135 PR00412C 
11.30 1.857e-11 165-
179 PR00412A 13.23
3.400e-11 100-119


1277 


PF00756 


Putative esterase. 


PF00756C 14.12 9.538e-
10 127-157 


1279 


BL00134


Serine proteases, 
trypsin family, 
histidine proteins. 


BL00134A 11.96 9.325e- 
13 128-145 


1280 


BL01220 


Phosphatidylethanolamine 
-binding protein family 
proteins . 


BL01220C 14.75 9.348e- 
15 248-276 


1285 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


bhOOSIB 12.23 2.28Se- 
10 33-42 


1287 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791B 28.49 7.182e- 
11 288-343 


1292 


PR00802


SERUM ALBUMIN FAMILY
SIGNATURE 


PR00802B 16.51 1.610e- 
10 81-105 


1297 


PR00716 


M-PHASE INDUCER
PHOSPHATASE SIGNATURE 


PR00716C 17.65 5.695e-
09 23-44


1298


BL00478 


LIM domain proteins.


BL00478B 14.79 6.478e- 
14 268-283 


1301 


BL00127


Pancreatic ribonuclease 
family proteins. 


BL00127C 31.49 3.571e-
28 82-126 BL00127B 
26.57 8.800e-28 23-68 


1302 


PR00637 


TYPE 3 BOMBESIN RECEPTOR 
SIGNATURE 


PR00637E 11.27 4.250e- 
09 290-306 


1307 


BL00215


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 5.500e- 
17 13-38 BL00215A 
15.82 1.000e-16 226- 
251 BL00215A 15.82 
2.658e-13 107-132 


1308 


PR00898 


VASOPRESSIN V2 RECEPTOR 
SIGNATURE 


PR00898H 11.34 4.682e-
09 552-572


1309


PD00301 


PROTEIN REPEAT MUSCLE
CALCIUM-BI.


PD00301B 5.49 2.731e- 
09 390-401 


1310 


BL00983 


Ly-6 / u-PAR domain 
proteins. 


BL00983C 12.69 9.654e- 
13 73-89 BL00983B 
8.19 3.132e-09 12-22 


1313 


BL00194 


Thioredoxin family 
proteins. 


BL00194 12. 1<5 1.900e- 
11 15-28 


1314 


BL00594


Aromatic amino acids 
permeases proteins. 


BL00594A 16.75 8.969e- 
10 53-97 


1316 


BL00134 


Serine proteases, 
trypsin family, 
histidine proteins. 


BL00134A 11.96 9.325e-
13 128-145 


1320 


BL00783 


Ribosomal protein L13
proteins.


BL00783C 22.43 6.559e- 
24 07-117 BL00783A 
14.55 1.600e-19 8-33 
BL00783B 12.76 3.500e- 
12 74-86 




PF00514


Armadillo/beta-catenin-
like repeat proteins. 


PF00514A 31.30 7.268e-
11 82-120 


1329 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 6.294e- 
11 129-148 BL00030B 
7.03 4.789e-09 168-178 


1331 


PR00497 


NEUTROPHIL CYTOSOL
FACTOR P40 SIGNATURE


PR00497A 6.92 ^..239e- " 
09 25-43 


1332 


PR00161 


NICKEL-DEPENDENT
HYDROGENASE/B-TYPE 
CYTOCHROME SIGNATURE 


PR00161C 9.51 4.930e-
09 317-337 


1333 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL- 
BINDING NU. 


PD01066 19.43 6.769e-
33 10-49 


"133^1 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700D 12.47 2.200e-
09 262-281 


1337


PR00700 


PROTEIN TYROSINE 


PR00700D 12.47 2.200e-
PHOSPHATASE SIGNATURE


1340 
1341 


PR00860 


METALLOTHIONEIN
SIGNATURE 


09 211-230 

PR00860A 5.46 5.034e-
13 5-18 




BL00893 


mutT domain proteins. 


BL00893 18.99 6.750e- 
16 46-71 


1343 


BL01282 


bir repeat proteins. 


BL012B2B 3b. 44 £.974e- " 
21 383-422 


1344 


DM00099 


4 Jew A55R REDUCTASE 
TERMINAL 

DIHYDROPTERIDINE. 


DM00099B 14.73 8.313e- 
09 417-427 


1345 


BL00923 


Aspartate and glutamate
racemases proteins. 


BL00923B 11.41 5.935e-
10 135-146 


1348 


PF00651


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 7.231e-
13 44-57 


1350 

1352 


PR00193


MYOSIN HEAVY CHAIN

SIGNATURE 


PR00193D 14.36 3.571e- 
32 416-445 PR00193C 
12.60 6.318e-31 179-
207 PR00193B 11.69 
3.571e-24 133-159 
PR00193E 19.47 9.069e- 
22 470-499 PR00193A 
15.41 1.783e-20 77-97 


1353




NATURAL RESISTANCE - 
ASSOCIATED MACROPHAGE 
PROTEIN SIGNATURE 


PR00447E 9.73 1.554e-
15 299-319 PR00447D 
13.54 3.408e-15 200- 
224 PR00447A 12.73 
6.357e-11 97-124
PR00447G 6.69 9.077e- 
10 353-373 


r 


BL00303


S-100/ICaBP type calcium
binding protein. 


BI.00303A 21.7V *.667e- ' 
26 45-82 BL00303B 
26.15 1.000e-24 93-130 


1355 


BL00039 


DEAD -box subfamily ATP- 
dependent helicases 
proteins. 


BL00039D 21.67 5.950e-
29 375-421 BL00039A 
16.44 7.136e-29 99-138 
BL00039C 15.63 4.000e- 
18 225-249 BL00039B 
19.19 3.182e-14 141- 
167 


1357 


PF00615 


Regulator of G protein
signalling domain 
proteins. 


PF00615B It). 25 2.216e- "' 
12 84-101 PF00615C
10.06 8.412e-12 162- 
176 


1360
1361


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 9.234e-
29 10-49 




PR00925


NONHISTONE CHROMOSOMAL 
PROTEIN HMG17 FAMILY
SIGNATURE 


PR00925A 5.47 5.091e-
18 14-29 PR00925B
3.73 6.143e-14 29-42
PR00925C 5.57 4.789e-
12 53-64 PR00925D
6.56 1.857e-10 76-87


1362 


BL01272 


Glucokinase regulatory
protein family proteins. 


BL01272B 19.61 6.870e-
30 136-171 BL01272C
11.60 3.314e-25 249-
274 BL01272A 6.49
1.231e-18 99-117


1363
1364


BL01272 


Glucokinase regulatory
protein family proteins. 


BL01272B 19.61 6.870e-
30 113-148 BL01272C
11.68 3.314e-25 226-
251 BL01272A 6.49
1.231e-18 76-94




DM00179 i 


tf~KINASE ALPHA ADHESION 
r-CELL. 


JM00179 13.97 5.304e- 
39 167-177 


1368 "; 
3370 j 


^K001^9 ] 
I 

fR00988 1 


POTASSIUM CHANNEL "l 
SIGNATURE ( 
JRIDINK KINASE SIGNATURE j 


?R00169A 1^.77 1.592e- 
39 76-96 

'R00988A 6.3TT7754e^ 1








10 1-19 


1371 


BL00242 


Integrins alpha chain 
proteins . 


BL00242B 8.13 8.615e- 
09 469-479 


1372 


PR00625 


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00625B 13.48 7.353e- 
19 46-67 PR00625A
12.84 1.391e-16 14-34


1373 


BL00434 


HSF-type DNA-binding 
domain proteins.


BL00434C 23.85 3.778e- 
09 90-130 


1374 


PR00952


LETHAL (2) GIANT LARVAE 
PROTEIN SIGNATURE 


PR00952C 8.00 6.337e- 
09 505-526 


1375 


PD02475 


MUCIN EPITHELIAL TUMOR- 
ASSOCIATE. 


PD02475A 23.18 8.552e-
10 1111-1150 


1376 


PD01066 


PROTEIN ZINC FINGER
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 9.571e- 
32 24-63 


1380


BL00194 


Thioredoxin family 
proteins. 


BLC0194 12.1* 8.333e- 
12 48-61 


1381 


DM01970 


0 Jew ZK632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.60 1.458e- 
15 1123-1136 


1383 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 7.600e-10 
243-254 


1384 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 7.600e-10 
271-282 


1385 


BL00303 


S-100/ICaBP type calcium 
binding protein. 


BL00303B 26.15 6.203e- 
10 95-132 


1386 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 5.042e- 
09 1574-1628 


1387 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 1.000e-
11 52-61


1389 


PD01066


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 3.600e- 
30 10-49 


1390 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL - 
BINDING NU. 


PD01066 19.43 3.512e-
31 32-71 


1392 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308C 3.83 9.723e- 
10 127-137 


1393 


PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 9.625e- 
25 88-110 PR00380D 
9.93 2.406e-20 304-326 
PR00380B 12.64 4.414e- 
16 208-226 PR00380C 
13.18 6.538e-16 243-
262


1394


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 3.406e-
14 462-475 PD00066
13.92 8.800e-14 348-
361 PD00066 13.92
9.571e-12 405-418
PD00066 13.92 6.087e-
11 490-503 PD00066
13.92 8.043e-11 320-
333 




1398 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 6.78*c- 
32 10-49 




1400 


DM01206


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 7.038e- 
09 270-290




1406


PD00930


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930A 25.62 7.324e- 
15 363-389 




1407


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins.


BL00030A 14.39 7.500e-
10 457-476 




1408 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 9.550e-
11 179-193 PR00019A 
11.19 8.826e-10 228- 
242 PR00019B 11.36 
1.360e-09 199-213 
PR00019B 11.36 4.960e-








09 176-190 


1409 


PR00510


NEBULIN SIGNATURE


PR00510A 9.09 4.150e-
12 182-202 PR00510B 
12.96 8.767e-12 210- 
230 PR00510F 9.88 
8.172e-10 58-75 
PR00510D 9.21 2.367e-
09 251-267 


1410 


PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00076B 13.14 5.696e- 
09 31-44 


1412 


BL00358 


Ribosomal protein L5
proteins.


"BL00358B 22.7<J l.OOCe- 
40 57-103 BL00358C 
13.75 6.087e-14 122- 
136 BL00358D 14.26 
5.500e-13 143-158 
BL00358A 13.06 1.931e-
11 33-44 


1414 


BL00282 


Kazal serine protease 
inhibitors family 
proteins . 


BL00282 16.88 7.338e- 
10 511-534 


1415 


BL00023


Type II fibronectin
collagen-binding domain 
proteins . 


BL00023 24.31 4.300e- 
29 40-77 


1417 


PR00681 


RIBOSOMAL PROTEIN S1
SIGNATURE 


PR00681G 12.54 2.149e- 
09 38-60 


1418 


DM00973 


3 Jew RESISTANCE BENOMYL 
YLL028W CYCLOHEXIMIDE. 


DM00973A 21.17 1.462e-
09 171-208 


1419 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE 


PR00319B 11.47 1.571e- 
09 428-443 


1420 


PD01941 


TRANSMEMBRANE
COTRANSPORTER SYMP.


PD01941A 14.81 1.000e-
40 142-196 PD01941B 
15.02 7.049e-30 400- 
447 PD01941E 15.92 
2.475e-20 817-864 
PD01941C 19.96 3.118e- 
19 488-543 PD01941D 
27.18 9.614e-18 641- 
690 PD01941F 28.52
5.382e-15 1038-1093 


1422 


PR00205


CADHERIN SIGNATURE 


PR00205B 11.39 8.043e- 
12 199-217 


1423 


PR00209 


ALPHA/BETA GLIADIN
FAMILY SIGNATURE 


PR00209B 4.88 t>.318e- 
11 1009-1028 


1424 


BL50002 


Src homology 3 (SH3) 
domain proteins profile.


BL50002A 14.19 8.200e-
14 367-386 BL50002A 
14.19 9.250e-12 298- 
317 BL50002A 14.19 
4.462e-11 206-227
BL50002B 15.18 1.000e-
09 244-256 


1425 


PF00628 


PHD-finger.


PF00628 15.84 3.045e- 
12 330-345 


1426 


PF00628


PHD-finger. 


PF00628 15.84 3.045e-
12 377-392 


1427 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 5.114e-
16 281-299 PR00405A 
17.71 4.306e-14 262- 
282 


1428 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 5.219e- 
34 147-193 


1429 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 8.920e- 
10 577-592 


1430 


PR00378


INOSITOL PHOSPHATASE 
SIGNATURE 


PR00378D 16.86 7.563e- 
12 295-314 PR00378B 
13.80 8.650e-10 166-
186 


1431 


PR00928 


GRAVES DISEASE CARRIER
PROTEIN SIGNATURE


PR00928B 13.53 3.769e-
10 103-124


1433 


BL01113 


C1q domain proteins.


BL01113B 18.26 7.049e- 
15 14-50 BL01113C 
13.18 7.000e-12 82-102


1434 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE 


PR00319B 11.47 7.983e- 
10 135-150 


1436 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 1.000e-
12 84-103 


1438 


BL00290


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290B 13.17 2.500e-
09 250-268 BL00290A 
20.89 4.000e-09 188- 
211 


1440 


PR00806 


VINCULIN SIGNATURE


PR00806B 4.28 4.960e-
09 38-52 


1441 


PR00806


VINCULIN SIGNATURE 


PR00806B 4.28 4.960e- 
09 88-102 


1444 


BL00422 


Granins proteins. 


BL00422D 19.48 1.000e-
08 114-138 


1445 


PD01841 


PHOSPHORYLASE KINASE
ALPHA MUSCL. 


PD01841A 21.71 1.000e-
40 73-123 PD01841B
14.35 1.000e-40 144-
185 PD01841D 17.87
1.000e-40 206-258
PD01841F 13.36 1.000e-
40 296-345 PD01841G
24.26 1.000e-40 349-
403 PD01841I 23.00
1.000e-40 494-536
PD01841J 14.94 1.000e-
40 895-932 PD01841L
18.42 1.000e-40 1083-
1125 PD01841E 18.60
9.719e-38 258-296
PD01841K 14.81 1.000e-
35 1041-1071 PD01841H
21.30 3.189e-31 435-
472 PD01841C 13.78
1.000e-25 185-206
PD01841M 10.82 1.250e-
20 1175-1194


1446 


PF00816 


H-NS histone family.


PF00816B 13.84 8.875e-
09 190-220 


1447 


PR00048 


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 5.050e-
09 402-416 


1448 


DM00315 


072 RIBONUCLEASE 
INHIBITOR. 


DM00315D 18.40 7.393e- 
09 23-67 


1451 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins.


BL00030B 7.03 2.800e- 
10 94-104 


1454 


DM01688


2 POLY-IG RECEPTOR.


DM01688D 13.44 7.146e- 
09 382-405 


1455 


PF00777 


Sialyl transferase 
family. 


PF00777C 18.60 2.929e- 
22 4-59 


1457 


BL00927 


Trehalase proteins. 


BL00927C 10.83 8.085e- 
09 42-53 


1460 


BL00545 


Aldose 1-epimerase 
proteins. 


BL00545C 11.28 7.353e- 
17 169-182 BL00545A
10.20 2.071e-15 73-89 
BL00545B 13.10 3.942e- 
09 140-153 


1466 


PR00097


ANTHRANILATE SYNTHASE 
COMPONENT II SIGNATURE 


PR60097C 9.42 9.0<i9e- 
09 233-245 


1472 


BL01129 


Hypothetical 
yabO/yceC/sfhB family 
proteins. 


BL01129E 13.25 5.250e-
22 170-195 BL01129C 
25.56 9.526e-18 63-106 


1473 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.821e- 
09 2114-2145 


1475 


PF00686 


Starch binding domain 
proteins. 


PF00686A 13.45 9.100e- 
09 267-277


1477 


PF00566 


Probable rabGAP domain 
proteins.


PF00566A 12.64 7.333e- 
10 466-476 


1478


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030B 7.03 9.400e- 
10 43-53 


1479 


DM00406


GLIADIN.


DM00406 7.73 8.541e-10 
292-305 


1480 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290B 13.17 2.385e- 
15 69-87 BL00290A 
20.89 5.091e-11 12-35


1481 


PR00150 


PHOSPHOENOLPYRUVATE 
CARBOXYLASE SIGNATURE 


PR00150F 10.45 9.039e- 
09 21-51 


1482 


PF00780 


Domain found in NIK1-
like kinases, mouse
citron and yeast ROM. 


PF00780I 14.69 4.825e- 
09 107-137 


1483 


BL01160 


Kinesin light chain
repeat proteins. 


BL01160B 19.54 1.153e- 
09 108-162 


1485 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 5.909e- 
25 17-56 


1486


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.529e- 
09 34-50 


1488 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 9.586e- 
10 116-162 


1490 


BL00166 


Enoyl-CoA 

hydratase/isomerase 
proteins . 


BL00166D 22.87 2.607e- 
24 190-226 BL00166C 
18.93 5.500e-14 140- 
167 BL00166B 16.92 
9.357e-11 93-115


1491 


BL00452 


Guanylate cyclases 
proteins. 


BL00452D 28.59 3.700e-
31 63-106 BL00452E
11.92 3.045e-13 115- 
131 


1492 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 3.667e- 
09 532-546 


1497 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.000e-
11 384-400 BL00107A
18.39 5.345e-11 322-
353 


1500 


PF00876 


Ogre family. 


PF00876E 7.99 1.947e- 
10 107-117 


1502 


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 4.789e-
24 112-155 


1503 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 4.789e- 
24 112-155 


1505 


BL01177 


Anaphylatoxin domain
proteins.


BL01177B 20.64 5.800e- 
24 448-475 BL01177C 
17.39 5.333e-19 402- 
421 BL01177B 13.61 
7.840e-16 155-171 
BL01177D 17.50 1.900e-
15 427-445 


1506 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 5.500e-
14 311-336 BL00972A
11.93 7.429e-14 48-66
BL00972B 20.72 8.759e-
10 341-363 


1512 


BL00523 


sulfatases proteins. 


BL00523B 19.27 4.536e- 
22 76-106 BL00523D 
9.89 1.563e-11 40-52
BL00523F 10.85 4.162e- 
09 159-170 BL00523G 
9.46 5.333e-09 256-266 


1516 


BL00914 


Syntaxin / epimorphin 
family proteins. 


BL00914 24.91 7.045e- 
14 168-218 


1518 


BL00600 


Aminotransferases class-
III pyridoxal-phosphate
attachment si. 


BL00600A 17.98 6.143e- 
19 98-122 BL00600E 
16.43 1.771e-17 302-
331 BL00600G 12.43
9.625e-17 377-396
BL00600B 19.60 5.091e-
15 160-186 BL00600C
16.18 6.040e-12 190-
206 BL00600F 8.77
1.000e-11 343-356
BL00600D 8.71 1.000e-
10 281-295


1523 


PD00930 


PROTEIN GTPASE DOMAIN
ACTIVATION.


PD00930B 33.72 9.600e-
18 41-82 


1528 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320B 12.19 4.774e- 
11 192-207 PR00320B 
12.19 8.839e-11 272-
287 PR00320B 12.19 
9.743e-10 106-121 
PR00320A 16.74 1.878e- 
09 192-207 PR00320A 
16.74 2.317e-09 106- 
121 PR00320A 16.74 
8.683e-09 272-287
PR00320C 13.01 8.800e- 
09 106-121 


1538 


DM01970 


0 kw ZK632.12 YDR313C
ENDOSOMAL III.


DM01970B 8.60 4.508e-
15 171-184 


1539 


PF00781 


Diacylglycerol kinase 
catalytic domain 
proteins (presumed).


PF00781D 11.11 7.593e-
10 103-127


1540


PR00965 


OCULAR ALBINISM TYPE 1 
PROTEIN SIGNATURE 


PR00965H 10.73 1.231e- 
29 312-334 PR00965E 
12.93 5.846e-29 172- 
195 PR00965F 5.98 
1.123e-28 209-231 
PR00965C 15.04 1.000e-
27 131-151 PR00965D
5.84 1.000e-27 150-170
PR00965G 8.52 2.440e-
27 258-279 PR00965B
4.80 8.650e-26 88-109
PR00965A 12.52 1.000e-
25 35-55 PR00965I 
3.91 6.442e-25 385-406 


1541 


BL01013 


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 9.719e- 
17 163-207 


1543 


PD02699 


PROTEIN DNA-BINDING
BINDING DNA. 


PD02699C 24.84 1.000e-
40 599-646 PD02699A 
8.91 2.286e-34 219-248 
PD02699B 18.28 6.143e- 
21 485-509 


1544 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.857e- 
10 182-197 PR00049D 
0.00 7.102e-09 67-82 


1547 


BL00951 


ER lumen protein 
retaining receptor 
proteins. 


BL00951C 19.35 1.000e-
40 93-142 BL00951D
13.94 8.714e-40 142-
177 BL00951A 15.10
1.000e-38 2-38
BL00951B 14.23 6.250e- 
33 38-69 


1548 


BL00536 


Ubiquitin-activating
enzyme proteins. 


BL00536F 13.65 8.920e- 
30 279-318 BL00536D 
22.91 5.737e-24 21-65 
BL00536E 16.94 4.696e- 
18 248-279 


1549 


PR00139 


ASPARAGINASE/GLUTAMINASE
FAMILY SIGNATURE 


PR00139C 11.72 9.679e- 
09 550-569 


1553 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 5.119e-
09 58-73


1556 


BL00061 


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 6.276e- 
13 67-105 


1557 


BL01228 


Hypothetical cof family 
proteins. 


BL01228D 17.44 8.105e- 
12 107-132 


1558 


BL01228 


Hypothetical cof family 
proteins.


BL01228D 17.44 8.105e-
12 107-132


1559 


BL01228 


Hypothetical cof family 
proteins . 


BL01228D 17.44 8.105e- 
12 107-132 


1562 


BL00522 


DNA polymerase family X 
proteins . 


BL00522C 11.90 6.600e- 
18 412-436 BL00522B 
27.30 1.738e-16 364- 
410 BL00522A 25.52 
6.000e-16 279-326
BL00522E 19.63 6.123e-
14 502-532 BL00522F 
14.90 2.385e-13 551- 
575 


1563 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 1.947e- 
11 46-59 


1564 


BL00299 


Ubiquitin domain 
proteins.


BL00299 28.84 2.823e-
10 324-376


1566


BL01013 


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 8.594e- 
17 184-228 BL01013C 
9.97 4.906e-12 14-24 


1567 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 3.400e-10
378-389 BL00678 9.67
5.800e-10 418-429
BL00678 9.67 8.800e-10 
295-306 


1576 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479B 12.57 5.235e- 
17 297-313 BL00479A 
19.86 6.625e-15 271- 
294 BL00479A 19.86 
2.667e-14 147-170 
BL00479B 12.57 6.294e- 
12 173-189 


1576 


PR00665


OXYTOCIN RECEPTOR
SIGNATURE


PR00665G 12.36 4.673e- 
24 364-384 PR00665D 
9.93 1.200e-22 138-155 
PR00665F 11.73 4.000e- 
22 337-354 PR00665C
5.89 1.000e-20 65-80
PR00665B 5.29 4.337e- 
19 24-39 PR00665E 
5.60 2.929e-15 246-260 
PR00665A 5.99 5.622e- 
15 11-25 


1577 


DM00099 


4 kw A55R REDUCTASE ™ - 
TERMINAL 

DIHYDROPTERIDINE.


DM00099B 14.73 9.308e- 
10 127-137 


1579 


BL00524


Somatomedin B domain 
proteins. 


BL00524A 9.65 6.776e- 
14 52-73 


1580 


PD02894


HYDROLASE N4- PRECURSOR 
PROTEIN SIGNAL BE. 


PD02894B 13.93 6.959e- 
16 182-215 PD02894A 
21.96 2.225e-10 57-103 


1581 


BL00411


Kinesin motor domain 
proteins. 


BL00411C 15.04 5.292e- 
12 32-54 BL00411H 
15.66 4.441e-11 245-
276 


1582 


PR00604 


CLASS IA AND IB 
CYTOCHROME C SIGNATURE 


PR00604A 11.13 2.440e- 
09 79-87 


1584 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 1.000e-
10 225-238 


1585 


DM01551 


kw OSTEOINDUCTIVE YOPM
MEMBRANE OUTER. 


DM01551C 14.62 9.455e- 
11 125-145 


1586 


DM01354


kw TRANSCRIPTASE REVERSE
II ORF2.


DM013S4S 11. *1 7.750e- 
09 474-495


1587 


PR00072 


MALIC ENZYME SIGNATURE 


PR00072B 13.77 7.955e- 
33 180-210 PR00072A 
12.75 6.040e-25 120-
145 PR00072C 11.42
2.286e-24 216-239
PR00072D 10.77 3.400e-
22 276-295 PR00072E 
10.54 1.360e-19 301- 
318 PR00072G 10.45 
5.304e-19 433-450 
PR00072F 8.87 5.935e- 
15 332-349 


1589 


BL00191 


Cytochrome b5 family, 
heme-binding domain 
proteins . 


BL00191H 15.64 1.537e- 
22 61-113 BL00191K 
17.38 9.027e-12 398- 
442 


1590 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III.


DM01970B 8.60 7.716e- 
13 211-224 DM01970B 
8.60 2.157e-12 94-107 


1591 


DM00517 


5 kw NUCLEAR 60.7 NUP1 
CHROMOSOME. 


DM00517B 10.96 6.625e- 
16 1175-1191 DM00517A
8.21 1.000e-11 1015-
1026 


1592


BL00037


Myb DNA-binding domain 
proteins repeat proteins 
proteins. 


BL00037B 15.92 3.250e- 
27 116-142 BL00037A 
16.68 2.500e-24 83-107 
BL00037A 16.68 3.250e- 
12 31-55 BL00037B 
15.92 3.526e-11 64-90
BL00037C 16.86 9.654e- 
10 146-164 


1595 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 1.514e- 
09 110-127 


1598 


PF00628


PHD-finger.


PF00628 15.84 3.250e-
11 1667-1682


1599


PR00014


FIBRONECTIN TYPE III 
REPEAT SIGNATURE 


PR00014D 12.04 5.500e- 
09 980-995


1600


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 6.571e- 
10 30-39 


1602 


BL00412 


Neuromodulin (GAP-43) 
proteins. 


BL00412D 16.54 5.402e- 
10 136-187 


1605 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 3.571e- 
10 44-57 


1607


BL00252 


Interferon alpha, beta 
and delta family 
proteins. 


BL00252A 18.49 £.^57e- 
23 20-57 BL00252B 
19.78 9.125e-16 58-109


1610 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 1.000e-
08 61-94


1611


BL00904 


Protein
prenyltransferases alpha
subunit repeat proteins
proteins.


BL00904C 8.98 7.353e- 
10 91-125 BL00904D 
1.47 6.018e-09 127-168 


1612 


PF00168


C2 domain proteins. 


PF00168C 27.49 3.250e- 
09 365-391 


1613 


BL00412 


Neuromodulin (GAP-43)
proteins. 


BL00412D 16.54 6.051e- 
09 932-983 BL00412D 
16.54 7.153e-09 933-
984 


1614 


BL00559 


Eukaryotic molybdopterin 

oxidoreductases 

proteins. 


BL00559I 13.63 3.531e- 
25 54-83 BL00559K 
13.17 2.957e-18 197- 
224 BL00559J 19.63 
6.870e-16 124-176 
BL00559L 13.60 9.000e- 
16 266-284 


1615 


PD01427 


TRANSFERASE 
METHYLTRANSFERASE BI.


PD01427B 22.45 3.025e-
22 500-541 PD01427A
19.94 8.773e-18 439-
472


1616 


BL00115 


Eukaryotic RNA 
polymerase II 
heptapeptide repeat 
proteins . 


BL00115Z 3.12 7.485e- 
09 152-201 BL00115Z 
3.12 9.603e-09 145-194 


1617 


BL00303


S-100/ICaBP type calcium 
binding protein. 


BLOO303B 26.15 7. 75*06- 
32 51-88 BL00303A 
21.77 1.947e-31 4-41 


1618 


BL01254 


Fetuin family proteins.


BL01254F 10.02 8.754e-
09 137-147 


1619 


PD01888


PEPTIDE REDUCTASE
PROTEIN METHI.


PD01888B 25.10 1.000e-
40 47-97 PD01888C 
21.56 7.000e-30 125- 
155 PD01888A 12.84 
8.800e-15 7-23 


1621 


PR00239 


MOLLUSCAN RHODOPSIN C-
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 3.455e- 
09 692-704 PR00239E
1.58 4.580e-09 697-709
PR00239E 1.58 4.580e-
09 702-714 PR00239E
1.58 5.193e-09 703-715


1622 


PR00860 


VERTEBRATE 
METALLOTHIONEIN
SIGNATURE 


PR00860B 7.04 1.900e- 
18 27-41 PR00860C 
9.61 1.474e-14 41-51 
PR00860A 5.46 1.720e- 
14 5-18 


1624 


PR00784


MITOCHONDRIAL BROWN FAT 
UNCOUPLING PROTEIN 
SIGNATURE 


PR00784D 15.86 8.027e-
11 77-95 


1626 


BL00325 


Actin-depolymerizing
proteins.


BL00325B 21.66 1.000e-
40 93-139 BL00325A
24.83 6.786e-23 61-93 


1631 


BL00064 


L-lactate dehydrogenase
proteins.


BL00064B 23.57 1.000e-
40 82-130 BL00064C
17.28 1.000e-40 137- 
182 BL00064E 27.20 
1.000e-40 223-275 
BL00064F 25.14 7.882e- 
36 286-331 BL00064A 
21.16 1.000e-33 22-60 
BL00064D 14.19 6.500e- 
31 182-212 


1632 


PR00063 


RIBOSOMAL PROTEIN L27 
SIGNATURE 


PR00063B 15.24 9.700e- 
11 59-84 PR00063A 
11.71 1.614e-09 34-59 


1634 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239D 0.00 1.105e- 
11 36-49 PR00239C 
3.51 2.538e-09 37-45 


1636 


BL01210 


Caveolins proteins. 


BL01210B 13.92 9.531e- 
10 133-183 


1637 


BL00982 


Bacterial-type phytoene
dehydrogenase proteins. 


BL00982A 18.41 5.388e- 
11 11-43 


1639 


BL01183


ubiE/COQ5
methyltransferase family
proteins. 


BL01183B 21.31 8.144e- 
12 132-177 


1640 


PR00015 


GRAM-POSITIVE COCCUS
SURFACE PROTEIN ANCHOR 
SIGNATURE 


PR00015B 9.84 8.468e- 
10 128-149 


1641 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320B 12.19 5.935e- 
11 364-379 PR00320A 
16.74 7.828e-11 364-
379 PR00320C 13.01
2.800e-10 279-294
PR00320C 13.01 2.800e-
10 364-379 PR00320B 
12.19 5.114e-10 279- 
294 PR00320A 16.74 
1.659e-09 279-294








PR00320A 16.74 2.u98e- 
09 229-244 


1642 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 6.464e-
09 114-130 


1643 


PR00169 


POTASSIUM CHANNEL 
SIGNATURE 


PR00169A 16.77 1.806e- 
11 74-94 


1644 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 2.200e-10 
109-120 BL00678 9.67 
5.737e-09 528-539 


1645 


BL01108 


Ribosomal protein L24 
proteins . 


BL01108A 20.33 7.366e- 
17 56-89 




PR00380 


KINESIN HEAVY CHAIN
SIGNATURE 


PR00380A 14.18 9.270e- 
21 103-125 PR00380D 
9.93 6.308e-18 386-408 
PR00380C 13.18 7.923e- 
16 332-351 PR00380B 
12.64 6.657e-15 292- 
310 


1647 


DM01242 


3 THREONINE--TRNA
LIGASE.


DM01242C 17.15 9.791e- 
37 340-381 DM01242E 
23.00 5.071e-31 463- 
505 DM01242D 23.29 
3.925e-30 420-463 
DM01242B 23.57 8.054e-
18 265-314 DM01242F 
10.61 7.618e-14 526- 
540 


1649 


PD00126 


PROTEIN REPEAT DOMAIN 
TPR NUCLEA. 


PD00126A 22.53 5.500e- 
10 13-34 


1651 


BL01160 


Kinesin light chain
repeat proteins.


BL01160B 19.54 6.720e-
11 431-485 


1652 


BL00933 


FGGY family of
carbohydrate kinases
proteins.


BL00933A 17.50 4.673e- 
12 11-35 BL00933E 
13.80 9.217e-09 456- 
472 


1653 


BL00795


Involucrin proteins.


BL00795C 17.06 2.988e-
10 70-115


1654


BL00982


Bacterial-type phytoene
dehydrogenase proteins. 


BL00982A 18.41 7.750e- 
17 302-334 


1655 


BL00982


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 7.750e-
17 282-314 


1656 


BL00741


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 1.391e- 
16 607-630 


1657 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 7.938e- 
11 114-136 


1658 


PR00910 


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE 


PR00910A 2.51 8.889e- 
10 442-455 


1659 


BL00972


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 4.140e- 
12 376-401 BL00972B 
20.72 5.629e-09 446- 
468 


1660 


BL00406 


Actins proteins. 


BL00406D 12.58 8.767e- 
15 188-243 


1661


PR00105 


CYTOSINE-SPECIFIC DNA 

METHYLTRANSFERASE 

SIGNATURE 


PR00105A 10.36 4.900e- 
13 1140-1157 PR00105B 
12.32 2.800e-12 1259- 
1274 PR00105C 10.86 
1.000e-10 1305-1319 


1662 


BL00280


Pancreatic trypsin 
inhibitor (Kunitz) 
family proteins. 


BL00280 24.6"i 3.17ie- 
33 3119-3163 


1663 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e- 
23 107-125 PR00319C 
13.41 5.714e-20 89-105 
PR00319A 15.27 5.286e- 
19 51-68 PR00319B 
11.47 8.200e-19 70-85




BL00018


EF-hand calcium-binding
domain proteins.


OiJ V WW Jm w / • JL 3 t v O vt?**JLU 

489-502 


1667 


PD01066 


PROTEIN ZINC FINGER 

ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 8.500e- 

38 7-46 


1669 


BL01153 


NOL1/NOP2/sun family
proteins.


BL01153D 19.69 1.188e-
17 115-141 BL01153C
13.67 8.977e-15 66-80 
BL01153B 20.52 1.885e- 
10 13-37 


1671 


PR00678 


PI3 KINASE P85 
REGULATORY SUBUNIT 
SIGNATURE 


PR00678H 9.13 3.100e-
10 1146-1169 


1672 


BL00598 


Chromo domain proteins.


BL00598 14.45 8.500e- 
20 27-49 


1673 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 8.329e- 
09 686-707 


1674 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.580e-
11 343-358 PR00049D
0.00 1.286e-10 342-357 


1676 


PR00747


GLYCOSYL HYDROLASE 
FAMILY 47 SIGNATURE 


PR00747H 12.76 8.636e-
19 427-448 PR00747G
14.50 2.286e-18 368-
393 PR00747C 12.06
7.500e-18 112-131
PR00747A 14.05 4.600e-
17 42-63 PR00747D
15.23 8.759e-17 163-
183 PR00747E 15.13
8.244e-15 254-272
PR00747B 7.65 5.355e-
13 75-90 PR00747F
13.56 8.714e-10 311-
328


1677 


PR00747


GLYCOSYL HYDROLASE
FAMILY 47 SIGNATURE


PR00747H 12.76 8.636e-
19 309-330 PR00747G
14.50 2.286e-18 250-
275 PR00747C 12.06
7.500e-18 112-131
PR00747A 14.05 4.600e-
17 42-63 PR00747B
7.65 5.355e-13 75-90
PR00747F 13.56 8.714e-
10 193-210


1680 


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 4.600e-10
406-417 BL00678 9.67
6.684e-09 320-331 


1681 


BL00678 


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 4.600e-10
329-340 BL00678 9.67 
6.684e-09 243-254 


1683 


PR00326


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 1.346e- 
13 389-410 


1685 


PR00646


RDC1 ORPHAN RECEPTOR
SIGNATURE


PR00646H 6.32 4.188e-
09 755-771 


1690 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 6.644e-
09 75-129


1691


PR00456


RIBOSOMAL PROTEIN P2
SIGNATURE 


PR00456E 3.06 7.281e-
10 418-433 PR00456E
3.06 7.281e-10 419-434
PR00456E 3.06 6.125e-
10 420-435


1692 


PR00456


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 7.281e- 
10 487-502 PR00456E 
3.06 7.281e-10 488-503 
PR00456E 3.06 8.125e- 
10 489-504 


1693 


BL00674 


AAA-protein family
proteins.


BL00674C 22.60 8.043e- 
24 274-317 BL00674B








4.45 4.000e-23 241-2£3 
BL00674D 23.41 8.560e- 
18 338-385 BL00674E 
15.24 1.720e-15 414- 
434 


1697 


PR00409 


PHTHALATE DIOXYGENASE 
REDUCTASE FAMILY 
SIGNATURE 


PR00409F 12.70 4.388e- 
10 427-447 


1698 


PR00466 


CYTOCHROME B-245 HEAVY
CHAIN SIGNATURE


PR00466C 10.17 3.443e-
13 187-208 PR00466B
5.03 5.500e-11 162-186
PR00466F 9.16 6.159e-
09 498-517 


1699 


BL00028 


Zinc finger, C2H2 type, 
domain proteins . 


BL00028 16.07 9.217e- 
12 283-300 BL00028 
16.07 3.769e-11 255-
272 BL00028 16.07
5.154e-11 171-188
BL00028 16.07 5.500e-
11 227-244 BL00028
16.07 1.500e-10 199-
216 


1700 


BL01019 


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 3.348e- 
15 62-102 BL01019B 
19.49 4.000e-15 107- 
162 


1703 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.484e- 
12 200-239 


1707 


PR00109


TYROSINE KINASE

CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.558e-
14 134-153 


1710 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019A 11.19 2.565e- 
10 116-130 PR00019B 
11.36 4.600e-09 113- 
127 PR00019B 11.36
7.120e-09 204-218 


1711 


BL01159


WW/rsp5/WWP domain 
proteins. 


BL01159 13.85 6.523e- 
11 232-247 BL01159 
13.85 5.408e-10 613- 
628 


1712 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 7.000e-
10 187-203


1713


PF00642 


Zinc finger C-x8-C-x5-C- 
x3-H type (and similar).


PF00642 11.59 9.550e- 
11 230-241 


1714 


PF00642 


Zinc finger C-x8-C-x5-C- 
x3-H type (and similar).


PF00642 11.59 9.550e- 
11 230-241 


1715 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 7.129e- 
09 7-51 


1718 


BL00353 


HMG1/2 proteins. 


BL00353C 14.83 6.018e- 
10 136-183 BL00353B 
11.47 8.866e-09 86-136 


1719 


BL00412 


Neuromodulin (GAP-43) 
proteins. 


BL00412D 16.54 5.408e- 
09 432-483 


1721 


BL00038 


Myc-type, 'helix-loop-
helix' dimerization
domain proteins. 


BL00038B 16.97 8.448e- 
12 79-100 BL00038A 
13.61 4.000e-11 52-68


1723 


PD00567 


PROTEIN RNA-BINDING RNA
REPEAT HYD.


PD00567C 9.17 8.500e-
09 418-428


1724 


BL01279 


Protein-L-
isoaspartate(D-
aspartate) O-
methyltransferase signa.


BL01279A 24.27 5.663e-
12 233-281 


1728 


BL00018 


EF-hand calcium-binding
domain proteins. 


BtdOOlB 1At ti.059e-ll 
73-86 * BL0OO18 7.41 
4.176a-ll 157-170 


1730 


BL00594


Aromatic amino acids 
permeases proteins. 


BL00594A 16.75 1.089e- 
09 17-61


1731 


BL01160 


Kinesin light chain
repeat proteins.


BL01160B 19.54 9.676e-


1732 


BL01160


Kinesin light chain
repeat proteins.


BL01160B 19.54 9.676e-
10 316-370 


1733 


PF00850 


Histone deacetylase
family.


rruuoauf A3, fu 4.349e- 
22 246-279 PF00850D 
14 76 6 ARflta-1rt I*?*?- 

J.** • /O O ,ODUc" ZU XI / — 

201 PF00850E 8.88 
8.691e-18 209-235 
PF00850G 22.75 4.098e- 
14 281-323 


1734 


BL00354


HMG-I and HMG-Y DNA-
binding domain proteins
(A+T-hook).


dLiUU^34u o.ol o.932e- 
09 292-307 


1735 


DM00179 


kw KINASE ALPHA ADHESION
T-CELL.


DM00179 13.97 5.263e- 
10 492-502 


1743 


PR00449


TRANSFORMING PROTEIN P21
RAS SIGNATURE 


PR00449A 13.20 1.188e- 
11 5-27 PR00449D 
10.79 2.241e-10 109- 
123 PR00449E 13.50 
9.289e-10 144-167 


1744 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.188e-
11 5-27 PR00449D 
10.79 2.241e-10 109- 
123 PR00449E 13.50 
9.289e-10 144-167 


1745


BL00720


Guanine-nucleotide
dissociation stimulators
CDC25 family sign.


BL00720B 16.57 8.297e- 
15 136-160 


1746 


PR00081 


GLUCOSE/RIBITOL 
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081B 10.38 6.727e-
11 45-57 PR00081E 
17.54 3.935e-10 150- 
168 


1747 


BL00439 


Acyltransferases
ChoActase / COT / CPT
family proteins.


BL00439H 18.24 8.435e- 
14 65-91 BL00439G 
13.40 2.895e-12 3-14 


1749 


PR00819 


CBXX/CFQX SUPERFAMILY 


PR00819B 10.83 7.158e-
11 4-20 


1751 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 3.400e-
14 33-46 PD00066
13.92 1.000e-13 89-102
PD00066 13.92 7.000e-
13 61-74 PD00066
13.92 6.571e-12 117-
130


1753 


BL01013


Oxysterol-binding
protein family proteins. 


aLtUXVxlu 26.81 o.516e- 
18 33-77 


1754 


BL00790 


Receptor tyrosine kinase

class V proteins. 


09 490-521 BL00790I 

BL00790I 20.01 6.357e- 
09 287-318


1756 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.750e-
35 10-49 


1758 


DM00406 


GLIADIN. 


DM00406 7.73 7.600e-09 
653-666


1762 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I. 


PDOi^A 28.2*) 4.529e- 
09 224-276 


1765


PR00326


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 5.950e- 
11 146-167 


1775 


PF00023


Ank repeat proteins.


PF00023A 16.03 3.077e- 
14 523-539 


1776 


BL00942 


glpT family of 
transporters proteins. 


BL00942F 15.07 4.343e- 
10 371-389 BL00942B 
20.36 8.040e-09 94-137 


1777 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.373e- 
09 279-312


1778 


BL00084 


Copper type II,
ascorbate-dependent
monooxygenases proteins.


BL00084D 25.11 3.700e- 
20 169-224 BL00084B 
24.26 8.134e-16 10-58 
BL00084C 27.71 8.412e- 
11 107-158 


1779 


BL01013 


Oxysterol-binding
protein family proteins.


BL01013D 26.81 3.758e- 
18 611-655 BL01013A 
25.14 2.831e-15 344- 
380 BL01013C 9.97 
6.308e-13 435-445 
BL01013B 11.33 3.717e-
12 409-420 


1783 


BL00741 


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 8.138e- 
13 492-515 


1784 


BL00741 


Guanine-nucleotide 
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 8.138e- 
13 492-515 



* Results include, in order: accession number subtype; raw score; p-value; position of
signature in amino acid sequence.
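
The field order given in the footnote above can be illustrated with a short parsing sketch. This is only an illustration, not part of the disclosed methods: the names `SignatureHit` and `parse_results` are hypothetical, and the accession pattern (two letters, five digits, optional subtype letter) is an assumption inferred from the entries in this table. The sample cell is the RESULTS entry for SEQ ID NO: 1433, including its line wraps.

```python
import re
from typing import List, NamedTuple


class SignatureHit(NamedTuple):
    """One hit from a RESULTS cell (hypothetical container for illustration)."""
    accession: str    # accession number with optional subtype letter, e.g. "BL01113B"
    raw_score: float  # raw score
    p_value: float    # p-value
    start: int        # first residue of the signature in the amino acid sequence
    end: int          # last residue of the signature


# Assumed layout of one hit: accession, raw score, p-value, start-end.
_HIT = re.compile(
    r"(?P<acc>[A-Z]{2}\d{5}[A-Z]?)\s+"
    r"(?P<score>\d+(?:\.\d+)?)\s+"
    r"(?P<p>\d+(?:\.\d+)?e-\d+)\s+"
    r"(?P<start>\d+)-(?P<end>\d+)"
)


def parse_results(cell: str) -> List[SignatureHit]:
    """Parse a RESULTS cell whose text may be wrapped across lines."""
    # Rejoin exponents and ranges split at a line wrap ("7.049e-\n15" -> "7.049e-15").
    text = re.sub(r"-\s+(?=\d)", "-", cell)
    # Collapse the remaining line breaks and runs of spaces.
    text = " ".join(text.split())
    return [
        SignatureHit(m["acc"], float(m["score"]), float(m["p"]),
                     int(m["start"]), int(m["end"]))
        for m in _HIT.finditer(text)
    ]


if __name__ == "__main__":
    cell = "BL01113B 18.26 7.049e-\n15 14-50 BL01113C\n13.18 7.000e-12 82-102"
    for hit in parse_results(cell):
        print(hit)
```

Entries that do not match the assumed pattern are skipped by this sketch rather than reported, so it should not be relied on to detect damaged rows.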
TABLE 4



SEQ ID
NO:


PFAM NAME


DESCRIPTION


p-value


PFAM
SCORE


2 




Immunoglobulin domain 


2.1e-32 


109.5


3 


pkinase 


Eukaryotic protein kinase
domain


1.3e-29


110.7


4 


zf-C2H2


Zinc finger, C2H2 type 


1.6e-21 


84.9


5 




Fibronectin type III domain


0


1097.1




fn3


Fibronectin type III domain 


0 


1035.0 


7 


fn3 


Fibronectin type III domain 


0




8 


fn3 


Fibronectin type III domain


0 


1097.1 


9 


TBC 


TBC domain 


4<}-40 


146.7


10


p450


Cytochrome P450


9.5e-17


62.0 


12 


ank 


Ank repeat 


6e-20


79.7 


14 


ig 


Immunoglobulin domain 


1.7e-05 


22.7 


15 


zf-MYND


MYND finger


1.3e-06


35.4


16


zf-MYND


MYND finger


1.3e-06


35.4 


17 


zf-C2H2 


Zinc finger, C2H2 type


1.7e-99


343.9 


18 


CAP_GLY


CAP-Gly domain 


1.2e-25 


98.7 


20 


IMPDH_C 


IMP dehydrogenase / GMP
reductase C terminus


1.6e-119


410.5


21


IMPDH_C


IMP dehydrogenase / GMP
reductase C terminus


4.3e-102


352.6 


22 


pkinase 


Eukaryotic protein kinase
domain


2.4e-79


277.0 


23 


pkinase 


Eukaryotic protein kinase 


8.4e-74


258.6


25 


RNA_pol_A


RNA polymerase alpha subunit


0 


1077.7 


26 


C1q




1.9e-10


44.4


27 


Ribosomal_L2
3




7.8e-32


111.2


28


Ribosomal_L2
3




1e-29


104.2


30 


zf-A20


A20-like zinc finger


1.5e-10


48.5


31 


zf-A20 


A20-like zinc finger 


1.5e-10


48.5


32


FMN_dh


FMN-dependent dehydrogenase


5.4e-179


608.1


34 


PID 


Phosphotyrosine interaction
domain (PTB/PID) 


3.8e-59 


209.9 


35 




Immunoglobulin domain 


1.4e-13 


48.8 


36 


ig 


Immunoglobulin domain


1.4e-13


48.8 


40 


kinesin 


Kinesin motor domain


6.7e-76


265.6


44 


Ets 


Ets-domain 


1.4e-56


182.1


45


Ets


Ets-domain


1.4e-56 


182.1 


46 


LRR


Leucine Rich Repeat


1.7e-13


58.3


48 


zf-C2H2 




i. . je-io-i 


552.8 


49 


ITAM


Immunoreceptor tyrosine-based
activation mot 


A,4e-u» 


31.9


50 


UCH-2


Ubiquitin carboxyl-terminal
hydrolase family


1.1e-26




51 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


1.1e-26


102.0 


52 


ras 


Ras family 


8.5e-45 


162.3 


53 


PRK 


Phosphoribulokinase


2.1e-65


230.7


54 


myb_DNA-
binding


Myb-like DNA-binding domain


0.096




55


voltage_CLC 


Voltage gated chloride channels 


3.3e-186 


631.9 


56 


sugar_tr


Sugar (and other) transporter 


0.00015 


-64.3 


57 


TBC


TBC domain 


2.2e-37 


137.6 


58 


ank 


Ank repeat


5.9e-25


96.3 


59 


ank 


Ank repeat 


5.9e-25 


96.3 


67


PMP22_Claudi 
n 


PMP-22/EMP/MP20/claudin family 


7.9e-49 


175.6 


68 


C2 


C2 domain 


7.9e-54 


192.2 


69 


C2 


C2 domain 


2.3e-54 


194.0


70


Kelch 


Kelch motif 


9.4e-99 


341.5 


72 


ig 


Immunoglobulin domain 


8.2e-28


94.7 


73 


pkinase 


Eukaryotic protein kinase
domain


8e-69


242.1






74


pkinase 


Eukaryotic protein kinase 
domain 


2.8e-38 


140.6 


t o 


zf-
C4_Topoisom


Topoisomerase DNA binding C4 
zinc fing 


5.4e-54 


192.8 


PJ 


Peptidase_S9


Prolyl oligopeptidase family


4.3e-10


36.8


84 


fn3 


Fibronectin type III domain


4.1e-51


183.2


86 


SH2 


Src homology domain 2 


3.1e-22 


67.7 


88 


ig 


Immunoglobulin domain 


0.0091 


14.0 


89


WD40 


WD domain, G-beta repeat 


2.1e-21


84.6 


92 


laminin_G


Laminin G domain


6.1e-27 


98.5 


93 


AMP-binding 


AMP-binding enzyme 


2.4e-13 


-37.2 


95 


pkinase 


Eukaryotic protein kinase 
domain 


1.4e-59 


211.4 


96 


pkinase 


Eukaryotic protein kinase 
domain 


" 2.6e-51 


" 183.9 


97 


adh_short


short chain dehydrogenase 


2e-61 


217.5 


98 


kinesin 


Kinesin motor domain 


2.2e-86


300.4


101 


IRS 


PTB domain (IRS-1 type) 


5.4e-36 


133.0 


102 


AAA 


ATPases associated with various 
cellular act 


6.8e-05


-5.2


104 


pkinase 


Eukaryotic protein kinase 
domain 


2.7e-73 


256.9 


106 


ras 


Ras family 


8.3e-24 


92.5 


107 


FYVE 


FYVE zinc finger 


5.4e-27


100.7 


108 


Cyt_reductas 
e 


FAD/NAD-binding Cytochrome
reductase


7.7e-61


215.5


109 


zf-C2H2


Zinc finger, C2H2 type 


2.3e-122 


420.0 


113 


pkinase 


Eukaryotic protein kinase 
domain 


4e-88 


306.2


116 


PH 


PH domain 


3.1e-11


45.2 


117


lipocalin


Lipocalin / cytosolic fatty-
acid binding pr


2.4e-14


53.5


118 


pkinase 


Eukaryotic protein kinase 
domain 


4.5e-20


76.3 


120 


WD40 


WD domain, G-beta repeat 


2.4e-14 


61.1 


121 


WD40 


WD domain, G-beta repeat 


2.4e-14 


61.1 


123 


IF5_eIF4_eIF 
2 


eIF4-gamma/eIF5/eIF2-epsilon


1e-32


122.2 


124 




Immunoglobulin domain 


6.5e-08


30.6 


127 


mito_carr 


Mitochondrial carrier proteins 


3e-16


58.6


128 


PP2C 


Protein phosphatase 2C 


2.2e-71 


250.6 


129 


ATP1G1_PLM_M 
AT8


ATP1G1/PLM/MAT8 family


3.1e-20


80.6 


130 


pfkB 


pfkB family carbohydrate kinase


4.5e-42 


137.1 


133 


ACBP 


Acyl CoA binding protein 


4.6e-22 


86.7


134 


rrm 


RNA recognition motif. 


1.2e-31


118.5 


135 


IQ 


IQ calmodulin-binding motif


2.6e-08


41.0


136 


ATP1G1_PLM_M
AT8


ATP1G1/PLM/MAT8 family


9.3e-22 


85.7 


139


WH2


Wiskott Aldrich syndrome 
homology region 2 


0.0067 


23.1 


140 


zf-C2H2 


Zinc finger, C2H2 type 


1.7e-82


287.5 


141 


Peptidase_S2
6 


signal peptidase I 


s.7e-io 


3*. 7 


143 


arf 


ADP-ribosylation factor family


1.2e-39 


145.2 


146 


KRAB 


KRAB box 


7.3e-30 


112.6 


148 


DUF6


Integral membrane protein DUF6 


0.096 


8.0 


149 


PDEase 


3'5'-cyclic nucleotide
phosphodiesterase


3.8e-80


231.1 


151 


S4 


S4 domain


1.1e-08


42.3


153


tRNA-synt_1d


tRNA synthetases class I (R) 


3.8e-103 


356.1 


154 


Cyt_reductas 
e 


FAD/NAD-binding Cytochrome 
reductase 


7.8e-60 


212.2 


\2.5S 


ras 


Ras family


3.6e-28 


107.0


157


actin 


Actin 


3.8e-26


37.1


158 


Jacalin 


Jacalin-like lectin domain


0.09 


-24.9 


160 


Zn_carbOpept


Zinc carboxypeptidase


5e-138


471.9


165 


pkinase 


Eukaryotic protein kinase 
domain 


5.1e-67 


236.1 


167 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


5.3e-07 


27.0


168 


Ribosomal_S1
5


Ribosomal protein S15


1.1e-06


29.0


169


DEAD


DEAD/DEAH box helicase


1e-48


157.0


171 


DUF59 


Domain of unknown function 
DUF59 


0.07


-17.4


172 


pkinase


Eukaryotic protein kinase 
domain 


3.7e-15 


58.6 


173 


globin 


Globin 


4.6e-18


67.4


174 


WW 


WW domain 


7.3e-06 


32.9 


175 


ras 


Ras family 


1e-31


118.8


178 


ATP1G1_PLM_M
AT8


ATP1G1/PLM/MAT8 family 


2.5e-17 


71.0 




zf-C2H2


Zinc finger, C2H2 type


1.5e-99


344.2


180 




C1q domain


8.8e-72


251.9


190 


Y_phosphatas


Protein-tyrosine phosphatase


4.9e-287 


967.0 


191 


efhand 


EF hand 


7.5e-16 


66.1 


193




Eukaryotic protein kinase

domain




285.6


194 






5.8e-31


111.4


195 


PALP 


Pyridoxal-phosphate dependent
enzyme


2.5e-64


227.1


197 


DnaJ


DnaJ domain 


1.6e-38 


141.4 


199 


RrnaAD 


Ribosomal RNA adenine 
dimethylases 


U . UU Oao 


<t ft q 


Z uU 


acid phospha 

*> 
c 


Histidine acid phosphatase


2.5e-10


%n ^ 
J / . z 


203


WH2


Wiskott Aldrich syndrome
homology region 2


0.00048


26.9


204


vATP-
synt_AC39


ATP synthase (C/AC39) subunit


1.3e-159


543.7


205 


vATP-
synt_AC39


ATP synthase (C/AC39) subunit


1.6e-139


476.9 


20£ 


ldl_recept_a


Low-density lipoprotein
receptor domain


2.4e-25


97.6


209 


ank 


Ank repeat 


1.4e-19 


78.4 


210 


Rhomboid 


Rhomboid family 


0.0035 


1.2 


211 


C1q


C1q domain


1.6e-70 


247.7 


212 


UQ_con


Ubiquitin-conjugating enzyme


7.4e-74 


258.8 


213 


UQ_con 


Ubiquitin-conjugating enzyme 


1e-53


191.9 


215 


DEAD 


DEAD/DEAH box helicase


1.8e-43


140.4 


216 


PMP22_Claudi 
n 


PMP-22/EMP/MP20/Claudin family 


4.5e-21 


83.4 


218


Glycos_trans
f_2


Glycosyl transferases


4e-21


83.6


219 


ig


Immunoglobulin domain 


0.092 


10.7 


222 


WD40


WD domain, G-beta repeat 


7.4e-23 


89.4 


224 


TPR 


TPR Domain 


1.2e-08 


42.1 


225 


DnaJ_CXXCXGX
G 


DnaJ central domain (4 repeats) 


1.5e-38 


141.5 


226 


DnaJ_CXXCXGX 
G 


DnaJ central domain (4 repeats) 


1.5e-38


141.5 


229 


HSP70 


Hsp70 protein 


2.4e-54 


194.0 


230 


GSHPx


Glutathione peroxidases 


3.4e-47 


170.2 


231 


tsp_1


Thrombospondin type 1 domain


0.0075 


17.1 


233 


cyclin 


Cyclin 


4.6e-144 


492.0 


234 


ras 


Ras family 


4.8e-50 


179.7 


235 


LRR 


Leucine Rich Repeat 


1.2e-30 


115.3 


236 


LRR


Leucine Rich Repeat


6.7e-29


109.4 


237 


PDZ 


PDZ domain (Also known as DHR 
or GLGF) . 


1.7e-09


45.0


244 


dCMP_cyt_dea 
m 


Cytidine and deoxycytidylate 
deaminase 


2.5e-0$ 


31.1


245




Immunoglobulin domain 


6.7e-08 


30.5 


248 


wnt 


wnt family of developmental 
signaling protei 


9.1e-270 


742.6


250 


mito_carr 


Mitochondrial carrier proteins 


1.3e-5$ 


193.4


254


adenylatekin
ase


Adenylate kinase


1.8e-14


55.7 


255 


Cation_efflu
x


Cation efflux family


2.8e-33


124.0


256 


SH3 


SH3 domain 


3.9e-14 


60.4 


"257 


Aa_trans 


Transmembrane amino acid 
transporter protein 


2.6e-52


187.2 


258 


adenylatekin 
ase 


Adenylate kinase 


2.1e-110 


380.2 


"259 


HIT 


HIT family


8.2e-07


25.3 


260 


Bacterial_PQ
Q 


PQQ enzyme repeat 


l.tfe-15 


65.0 


262 


proteasome 


Proteasome A-type and B-type


6.5e-64


225.7


267 


pkinase 


Eukaryotic protein kinase 
domain 


6.3e-27


101.0


270 


filament 


Intermediate filament proteins 


3.2e-150


512.5


271 


Choline_kina 
se 


Choline/ethanolamine kinase 


2e-67 


237.4 


277 


Ribosomal_S7


Ribosomal protein S7p/S5e


3.3e-20


80.6 


279 


pkinase 


Eukaryotic protein kinase 
domain 


3.3e-77


o<J o 6 


280 


WD40


WD domain, G-beta repeat


7.8e-73


255.4


281 


WD40


WD domain, G-beta repeat


7.8e-73


255.4


284 


zf-DHHC 


DHHC zinc finger domain 


4.6e-24


93.4


287 


Exonuclease 


Exonuclease 


1.4e-67 


238.0 


291 


SAM 


SAM domain (Sterile alpha 
motif) 


0.034


11.2 


292 


SAM 


SAM domain (Sterile alpha 
motif) 


0.034


11.2


294 


zf-C2H2


Zinc finger, C2H2 type 


1.4e-29 


111.7 


295 


zf-C2H2


Zinc finger, C2H2 type 


2.2e-125 


430.0


296


mito_carr


Mitochondrial carrier proteins


4.1e-59


205.5


297 


HMG_box 


HMG (high mobility group) box 


6.7e-29 


109.4 


302 


Glycos_trans
f_4


Glycosyl transferase


5e-87


302.5


304


tRNA-synt_2


tRNA synthetases class II (D, K
and N)


1.1e-84


294.8


305 


KRAB 


KRAB box 


2e-44 


161.0 


306 


rrm 


RNA recognition motif. 


2.7e-44


160.6


308 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


5.2e-39 


126.1


309


DNA_polymera 
seX 


DNA polymerase X family 


2.4e-64 


227.2


311 


F-box 


F-box domain. 


9.5e-08 


39.2 


312 


ig 


Immunoglobulin domain 


6.8e-19 


65.9


313 


Ets 


Ets-domain


8.1e-60 


192.3 


315 


Kelch 


Kelch motif 


1.3e-106 


367.6 


317 


arf 


ADP-ribosylation factor family 


3.2e-35 


130.4 


318 


sugar_tr 


Sugar (and other) transporter 


0.0003 




320 


pkinase 


Eukaryotic protein kinase 
domain 


8.1e-83 


288.6


322 


pkinase 


Eukaryotic protein kinase 
domain 


4.9e-81 


282.6 


324 


Xlink 


Extracellular link domain 


4.5e-143 


331.5 


32* 


ARID 


ARID DNA binding domain


5.1e-37


136.4 


327 


HMG_box


HMG (high mobility group) box 


6.7e-29 


109.4 


328 


cadherin 


Cadherin domain 


6.1e-81 


281.9 


331 


chromo 


'chromo' (CHRromatin
Organization Modifier)


4e-18


6^.7 


333


Peptidase_M2
2


Glycoprotease family


1.2e-135 


4*7.4


335 


vwa 


von Willebrand factor type A 
domain 


2.3e-07


1 1 o 
J f . j 


339 


ras 


Ras family 


7.8e-07


-39.1


340 


zf-C2H2 


Zinc finger, C2H2 type 


8.2e-64


225.4


342 


zf-C2H2


Zinc finger, C2H2 type


2.4e-85


297.0


343 




Immunoglobulin domain 


0.0005


18.0


346 


pkinase 


Eukaryotic protein kinase 
domain 


6.5e-65


229.1


347


pkinase


Eukaryotic protein kinase
domain


6.5e-65


229.1


351


egf 


EGF-like domain 


8.5e-20 


79.2 


352 


ank


Ank repeat


2.5e-101


350.0


354 


TBC 


TBC domain 


5.1e-15


63.3


355 


PHD 


PHD-finger


3.2e-07


37.4 


358 


DUF6 


Integral membrane protein DUF6 


0.033 


15.8


359 


zf-C2H2 


Zinc finger, C2H2 type 


7.4e-20




361 


ank 


Ank repeat


6.6e-34


126.1


362 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


4.7e-53


18 $'7l 


363 


efhand 


EF hand 


5.4e-10


AH K 

TV tO 


367 


LRR


Leucine Rich Repeat


8.8e-44


158.9


368 


laminin_G


Laminin G domain 


1.5e-33 


121.7 


369 


PP2C 


Protein phosphatase 2C 


5.3e-20




372 


LIM


LIM domain containing proteins 


•S . JC" J.S 


57.1 


373 


KRAB 


KRAB box 


A flo _ O "l 
. OS' £j 


90.0


376 


ion_trans 


Ion transport protein 




-4.2 


377 


Beach






704.5


380 


pkinase 


Eukaryotic protein kinase
domain 


i. . oe- 34 


327.5 


381 


AMP-binding 


AMP-binding enzyme 


1.4e-07


_ 1 A. ft 1 ' 


382 


HECT 


HECT-domain (ubiquitin-
transferase).


1.3e-07 


-13.5 


384 


ank 


Ank repeat 


2.5e-101


350.0 


386 


ig 


Immunoglobulin domain 


-* • 3C"UO 


23.6 


388 


zf-C2H2 


Zinc finger, C2H2 type


1.7e-42




389 


ig 


Immunoglobulin domain 


2.8e-15


"V7 — =5 


390 


mito_carr


Mitochondrial carrier proteins 


3.5e-6? 




392


TPR


TPR Domain


6.1e-17


fi<> n ' 

D9 . / 


393 


SH3 


SH3 domain 


3.5e-09


. y 


394 


AAA 


ATPases associated with various 
cellular act 


4.1e-21


83.6


396 


spectrin 


Spectrin repeat 


2.1e-67 


237.3 


397 


zf-C2H2 


Zinc finger, C2H2 type 


0.0066 


23.1


399 


fn3 


Fibronectin type III domain 


4.1e-102


352.6 


400 


WD40 


WD domain, G-beta repeat 


0.00049


26.8


401 


E1_dehydrog


Dehydrogenase E1 component


3e-119


409.6 


402 


fn3


Fibronectin type III domain 


0 


1719.6


404 


LRR 


Leucine Rich Repeat 


2.1e-10


48.0


405 


cadherin 


Cadherin domain 


8.1e-81


281.9


406 


zf-CXXC


CXXC zinc finger


5e-15


63.4


410 


RhoGEF 


RhoGEF domain 


l.le-23 


92.1


411 


F-box 


F-box domain. 


4.2e-06


33.7 


412 


SNF2_N 


SNF2 and others N-terminal
domain 


5.8e-16 


61.6 


415 


CPSase_L_cha 
in 


Carbamoyl-phosphate synthase
(CPSase) 


1.5e-172 


586.6 


418 


LRR 


Leucine Rich Repeat 


3.8e-24 


93.6 


419 


DENN


DENN (AEX-3) domain 


2e-58 


207.5 


420 


RasGEF 


RasGEF domain 


5.1e-43


1*5.1 


421 


ank 


Ank repeat 


1.4e-153 


523.7 


424 


G-patch


G-patch domain


1e-19


78.9 


425 


pkinase 


Eukaryotic protein kinase 
domain 


2.2e-31 


117.1 


426


Plexin_repea
t


Plexin repeat 


0.0023 


24.6 


427 




Plexin repeat 


0.0023


24.6


429 
431 


zf-C3HC4
DEAD 


Zinc finger, C3HC4 type (RING
finger)

DEAD/DEAH box helicase


8.6e-11


39.2 


432 


SH3 


SH3 domain


3.4e-i£ 


214.6 
67.2


433 


GTP_CDC


Cell division protein


2.1e-114 


393.5 


436 


Collagen 


Collagen triple helix repeat
(20 copies)


4.6e-194


658.1 


438 


Ricin_B_lect
in 


Similarity to lectin domain or 
ricin b 


0.0085 


10.5 


441 


Alpha_adapti
n_C


Alpha adaptin carboxyl-terminal
domai


1.2e-256


866.0


442 


Alpha_adapti
n_C


Alpha adaptin carboxyl-terminal
domai 


l":Be.235 " 


795.7


443 


PDZ


PDZ domain (Also known as DHR
or GLGF) . 


1.9e-65 


230.9 


445 
446 


LON 

ig 


ATP-dependent protease La (LON)
domain

Immunoglobulin domain


0.00012 
0.00011 


-17.1 
20.1 


451
452


sushi
fn3


Sushi domain (SCR repeat)

Fibronectin type III domain


1.4e-18 
1.5e-06 


75.2 

35.2


454 
456 


pyridoxal_de
C

kinesin


Pyridoxal-dependent
decarboxylase conse
Kinesin motor domain


8.3e-14




457 
458 


neur_chan 
Josephin 


Neurotransmitter-gated ion-
channel
Josephin


4.9e-217
1e-175

0.0002 


734.4 
597.1 

18.7 


468 
470 

471 


bZIP 

NTP_transfer 

ase 

WD40 


bZIP transcription factor
Nucleotidyl transferase

WD domain, G-beta repeat


1.7e-07 
6.3e-0£ 

2e-28 


31.8 
--2(5.3 " 

107.9 


473 
477 

479 


LIM 

zf-RanBP
WD40


LIM domain containing proteins
Zn-finger in Ran binding
protein and others.
WD domain, G-beta repeat


0.00021 
0.028 

*.5e-18 


20.7 
21.0 

73.0


480 
481 

485 


KRAB
ArfGap

SH2


KRAB box

Putative GTP-ase activating

protein for Arf

Src homology domain 2


1e-31
8.4e-66

0.011


118.8
232.0

11.4


486 
487 

489 


C1q
dsrm

zf-C2H2


C1q domain

Double-stranded RNA binding

motif

Zinc finger, C2H2 type


4.3e-74

1.1e-47


259.6 
171.9 


490 


Alpha_adapti
n_C


Alpha adaptin carboxyl-terminal
domai


4.8e-153
3.4e-222 


521.9
75T7S 


492


SKI


Shikimate kinase


1.2e-10 


48.8 


497 


ENV_polyprot
ein


ENV polyprotein (coat
polyprotein)


2.6e-22 


77.6 


498 
500 


abhydrolase_
2 

rrm 


Phospholipase/Carboxylesterase 
RNA recognition motif. 


0.041 


-48.1 


501
502


WW
ig


WW domain

Immunoglobulin domain


5.4e-34
4.6e-18
1.1e-10


126.4 

73.4 

39.5 


504 
505 


abhydrolase
vwa


alpha/beta hydrolase fold
von Willebrand factor type A
domain


0.045
7.1e-62


-3.£ 
219.0 


508 
509 


Na_K_ATPase_

C

Exonuclease


Na+/K+ ATPase C-terminus
Exonuclease 


2.3e-145 


496.3 


510


Glycos_trans
f_1


Glycosyl transferases group 1 


1.3e-£4 
2.9e-06 


201.5
27.0 


511 


Glycos_trans


Glycosyl transferases group 1


2.9e-06


27.0 


512 


Glycos_trans
f_1


Glycosyl transferases group 1


1.9e-09


38.5 


514


pro_isomeras
e


Cyclophilin type peptidyl-
prolyl cis-tr


1.8e-63


221.4


515 


EGF 


EGF-like domain


1.9e-18


74.7


516


Surp 


Surp module 


4.3e-38 


140.0 


523 


ig


Immunoglobulin domain 


3.3e-06 


25.0 


526 


UBX 


UBX domain 


l.le-34 


128.6


528 


adh_zinc 


Zinc-binding dehydrogenases


2.7e-34


127.4 


530 


SAM 


SAM domain (Sterile alpha 
motif) 


0.046 


10.0 


531 


adh_short 


short chain dehydrogenase 


0.0025 


-34.1 


532 


mito_carr


Mitochondrial carrier proteins 


2.5e-81


281-77 


533 


mito_carr


Mitochondrial carrier proteins


2e-61


213.5 


534


thiolase 


Thiolase 


3.5e-183


622.0


535 


FMO-like 


Flavin-binding monooxygenase- 
like 


0 


1153.7 


536 


SCAN 


SCAN domain 


4e-55


xbCTZ


537


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V)


3.1e-136


466.0


538 


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V)


3.1e-136


1 0 0 . u 


539 


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V)


1.9e-117




540


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V)


3.1e-136


466.0


541 


vATP-synt_E 


ATP synthase (E/31 kDa) subunit 


5.9e-85


295.7 


543 


zf-C2H2


Zinc finger, C2H2 type


5 . 5e-£§ 




"$44 - 


DUF101


Protein of unknown function 
DUF101 


8.5e-38 


139.0 


545 


TGFb_propept
ide


TGF-beta propeptide


1.1e-67


238.2 


547 


WD40 


WD domain, G-beta repeat 


2.6e-32




548


rhd


Rel homology domain (RHD)


1.6e-238


6>8 6 2 


549


MMR_HSR1


GTPase of unknown function


5.4e-67


236.0 


551 


HECT 


HECT-domain (ubiquitin-
transferase).


*i . JO* X.A 1 


435.6


"554 


MHC_II_alpha 


Class II histocompatibility 
antigen, alp 


3.5e-74


O CO 0 


555


zf-UBR1


Putative zinc finger in N- 
recognin 


3.3e-16 


67.3 


556 


Kelch 


Kelch motif 


5.5e-29


109.7 


561 


AMP-binding 


AMP-binding enzyme 


2.8e-06





562 


PABP


Poly-adenylate binding protein,
unique domai


4.9e-38


139.8


564 


Gag_p30


Gag P30 core shell protein


1.2e-67


238.2


566 


PWWP 


PWWP domain 


8.1e-16


66.0 


567 


SCAN 


SCAN domain 


7.3e-68 


238.9 


569 


pkinase 


Eukaryotic protein kinase 
domain 


1.5e-84


294.3


570 


pkinase 


Eukaryotic protein kinase 
domain 


1.5e-84 


294.3 


571 


CN_hydrolase 


Carbon-nitrogen hydrolase 


0.00081 


-79.7 


572 


myosin_head 


Myosin head (motor domain) 


0 


1495.2 


573 


myosin_head


Myosin head (motor domain) 


0 


1490.4 


575 


Surp 


Surp module 


1.7e-23 


91.5


576 


Surp 


Surp module 


1.7e-23 


91.5


577 


DNA_pol_B


DNA polymerase family B 


0 


1138. £ 


578


PDZ


PDZ domain (Also known as DHR
or GLGF).


8.3e-09


42.7


579 


LRR


Leucine Rich Repeat 


4.9e-21 


83.3 


580 


neur_chan 


Neurotransmitter-gated ion- 
channel 


5.9e-177


861.3 


583


sushi 


Sushi domain (SCR repeat) 


0 


1673.0 


584


DEAD 


DEAD/DEAH box helicase 


7.3e-36 


116.3 


586 


KH-domain


KH domain


2.9e-13 


57.5 


587 


G-patch 


G-patch domain 


2.3e-14 


61.2


589 


LIM


LIM domain containing proteins


2.3e-36 


133.4 


590 


bromodomain 


Bromodomain 


6.6e-32 


114.7 


591 


bromodomain


Bromodomain


6.6e-32


114.7 





592


hormone_rec


Ligand-binding domain of
nuclear hormone


3.5e-22


87.1 


593 


PHD 


PHD-finger


3.8e-12 


53.8 


594 




Cadherin domain


a Op. qq 


Jfk^ • / 


596 




domain 


■cp.03" 


Ji? . ^ 


597 


WD40 


WD domain, G-beta repeat


n 000^4 

\J • UUU9l 


*o ♦ / 


600 


FG-GAP 


FG-GAP repeat




£06 ■ 7 


602 


G_Adapt_CT 


Gamma-adaptin, C-terminus


1.1e-53


191.8


603




Eukaryotic protein kinase


2.3e-86


inn 4 


605 


Collagen


(20 copies) 


8e-42 


152.4


606 


tnifco 




6 • 3e-6>7 


232.3 


608 


PWWP 


PWWP domain


2.6e-28


107.5


609 


PWWP 


PWWP domain 


2.6e-28 


107.5 




PAD r*T V 

l*~/Vf oJj i 


CAP-Gly domain


0.0046


20.1


til* 


RFX_DNA_bind 
ing 


RFX DNA-binding domain 


5.2e-54 


192.9 


bib 


kinesin 


Kinesin motor domain 


1.1e-81


284.8




kinesin 


Kinesin motor domain 


8.4e-80 


278.5 


618 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


0.0098


13.1


620 


MATH 


MATH domain 


7.8e-0£ 


22.2 


621 


Y_phosphatas 
e 


Protein-tyrosine phosphatase


1.4e-32 


121.6 


622 


pkinase 


Eukaryotic protein kinase 
domain 


4.4e-40


146.6


623 


BNR 


BNR repeat 


2.1e-11


51.3 




molybdopteri
n


Prokaryotic molybdopterin
oxidoreductas


1.4e-12


42.2 


625 


TPR 


TPR Domain 


l.le-17 


72.2 


627 


cNMP_binding


Cyclic nucleotide-binding
domain


3.7e-58


206.6


oj\j 


adh_short


short chain dehydrogenase


ae-i / 


m n 
/ u . u 


631


zf-C2H2 


Zinc finger, C2H2 type 


2.1e-88 


307.1 




rrm 


RNA recognition motif.


4e-05 


30.5 


635 


pkinase 


Eukaryotic protein kinase 
domain 


1.6e-104 


360.7 




Fork_head


Fork head domain


o . 9e«* / 


1UJ . U 


bj / 


pkinase 


Eukaryotic protein kinase 
domain 


3.8e-70


246.5


642 


TPR 


TPR Domain 


4.8e-08 


40.1 


643 


efhand


EF hand


1.9e-27


1 OA C 
lU«k . 0 


647 




uuuiaiii 


i Oq.i rti 
j. . ze- ± uj. 


JD1 ■ 1 


64 3 


PseudoU_synt
h_2


RNA pseudouridylate synthase


1.9e-55


197.6 


650 


zf-C2H2 


Zinc finger, C2H2 type


0.0087


22.7


651 


ank


Ank repeat


1.5e-17


71.9


652


I_LWEQ


I/LWEQ domain


9.5e-101


341.0


653 


neur_chan


Neurotransmitter-gated ion-

channel


4.1e-171


581.8


654 


tsp_1


Thrombospondin type 1 domain


4.1e-47


169.9


659 


FH2 




1e-107


371.2


661 




Pou domain - N-terminal to

homeobox domain


5.3e-45


162.9 


662 


C2 


C2 domain 


6.7e-19 


76.2 


663 


C2 


C2 domain 


6.7e-19 


74.2 


664 


C2 


C2 domain 


6.7e-19 


76.2


667


GST


Glutathione S-transferases.


9.3e-34 


114.4 


668 


LRR 


Leucine Rich Repeat 


9.3e-31 


115.6 


670 


spectrin 


Spectrin repeat 


4e-57 


203.2 


671


I_LWEQ 


I/LWEQ domain 


9.5e-101 


341.0 


672 


ABC_tran 


ABC transporter 


5.3e-60 


212.8 


674 


WD40 


WD domain, G-beta repeat


4.8e-24


93.3


675


WD40 


WD domain, G-beta repeat 


4.8e-24 


93.3


676 


LRR 


Leucine Rich Repeat


0.0015


25.2


679 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H
type


2.4e-29


107.7 


680


zf-C2H2


Zinc finger, C2H2 type 


5.2e-05 


30.1 


681 


CH 


Calponin homology (CH) domain


2.4e-17


71.1


682


DSPc


Dual specificity phosphatase, 
catalytic doma 


4.3e-43 


156.6 


683 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


0.051 


10.8 


687 


Synapsin 


Synapsin 


0 


1890.8 


689 


PR55 


Protein phosphatase 2A 
regulatory subunit PR 


0 


1038.8 


691 


homeobox 


Homeobox domain 


8.5e-30 


112.4 


696 


Peptidase_M2
4 


metallopeptidase family M24 


2.6e-59


210.5


697 


RhoGEF 


RhoGEF domain 


9.5e-35 


128.9


698 


PHD 


PHD- finger 


0.008 


9.3 


701 


zf-C2H2 


Zinc finger, C2H2 type 


5.5e-123 


422.0


702 


Sulfatase


Sulfatase


3e-231 


781.6 


703 


zf-C2H2 


Zinc finger, C2H2 type 


5.7e-20


79.8 


707 


Acyl_transf 


Acyl transferase domain 


l.le-22 


88.8


708 


WD40 


WD domain, G-beta repeat 


4.8e-19 


76.7 


710 


Ran_BPl 


RanBPl domain. 


8.4e-06


-7.3 


713 


DEAD 


DEAD/DEAH box helicase


5.9e-42 


134.9 


714


PH


PH domain


1.6e-09


39.0 


715 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


1.5e-37


138.2


717 


Sialyltransf


Sialyltransferase family


7.5e-31


11$ . 9 


718 




Immunoglobulin domain 


le-29 


100.8


719


integrin_B


Integrins, beta chain


0


1125.4


720 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


l.le-08 


32.4 


722 


Peptidase_C2 


Calpain family cysteine 
protease 


3e-145


495.9


723 




Immunoglobulin domain 


2.2e-05 


22.4


724 


F-box 


F-box domain. 


0.007 


23.0 


725 


Nop 


Putative snoRNA binding domain 


8.1e-58


205.5


726


Nop


Putative snoRNA binding domain


8.1e-58


205.5 


727 


WD40 


WD domain, G-beta repeat 


7.5e-26 


99.3 


730 


dsrm 


Double-stranded RNA binding
motif 


0.027 


12.1 


731 


dynamin 


Dynamin family 


4.2e-16


66.9 


733 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H
type


2.8e-10


41.1


735


CDP-

OH_P_transf


CDP-alcohol
phosphatidyltransferase


4.2e-26 


100.1 


738 


DEAD 


DEAD/DEAH box helicase 


8.6e-57 


182.5


739 


TSC22 


TSC-22/dip/bun family 


6.5e-32 


119.5


742 


ras 


Ras family 


2.2e-100


346.9 


743 


PMI_typeI 


Phosphomannose isomerase type I


1.2e-243 


822.9 


747 


trypsin 


Trypsin 


6.4e-88 


279.4 


748 


kazal


Kazal-type serine protease 
inhibitor domain 


2.2e-52 


187.4 


749 


efhand 


EF hand 


6.3e-06 


33.1


751 


PHD


PHD-finger


4.9e-16


66.7


752 


zf-C2H2


zinc finger, C2H2 type 


3.2e-21 


83.9 




Hydrolase 


haloacid dehalogenase-like 
hydrolase 


6.1e-11


49.8 


754


Ribosomal_L3
9


Ribosomal L39 protein


0.00018


26.7 


755 


PH


ph domain 


3.6e-14 


55.7 


758 


SCAN 


scan domain 


1.4e-53 


191.5 


759


PA


PA domain


0.0065


23.1


760


arf


ADP-ribosylation factor family


2.2e-19


77.8


761


CIDE-N


CIDE-N domain


2.2e-40 


147.6


762 


histone


Core histone H2A/H2B/H3/H4


9.9e-53


188.6 


763 


zf-MYND 


MYND finger 


4.1e-14


60.3 


764 


pou 


Pou domain - N-terminal to
homeobox domain


1e-52


188.6


767 


vwc 


von Willebrand factor type C 
domain 


2.9e-34 


127.3


769 


efhand


EF hand


4.8e-11


50.1 


770 


zf-C4 


Zinc finger, C4 type (two 
domains) 


2.4e-53


181.6


772 


ras 


Ras family 


7e-90


312.0


773 


Sulfatase 


Sulfatase 


1e-142


487.5 


775


zf-C2H2


Zinc finger, C2H2 type


1.1e-12


55.5 


776 


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-12


55.5 


777 


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-12


55.5 


778


rrm


RNA recognition motif.


2.1e-32


121.1


779


G6PD


Glucose-6-phosphate
dehydrogenase


1.5e-76


236.6


780


spectrin


Spectrin repeat


3.7e-29


110.3


781 


mito_carr


Mitochondrial carrier proteins


4.6e-57


198.5


782 


SCAN 


SCAN domain 


1.3e-24


95.2


783 


PDZ 


PDZ domain (Also known as DHR 
or GLGF) . 


4.1e-07 


37.1


785 


DEAD 


DEAD/DEAH box helicase 


6e-06


21.7


786 


ras 


Ras family 


5.3e-39


143.0 


787 


RNase_HII


Ribonuclease HII


2.5e-67


237.1


790 


PI3_PI4_kina
se


Phosphatidylinositol 3- and 4-
kinases


5.4e-108


372.2


795 


cadherin 


Cadherin domain 


2.5e-40


147.4


796 


ARID


ARID DNA binding domain 


1.6e-20 


81.6


797


trypsin


Trypsin


9.9e-20


64.8


799 


CH 


Calponin homology (CH) domain 


3.7e-15


63.8


801 


Gal-

bind_lectin


Vertebrate galactoside-binding
lectin


4.1e-25


88.7 


803 


WD40


WD domain, G-beta repeat 


0.00082 


26.1 


806 


TBC 


TBC domain 


1.6e-26 


101.4


807 


TBC 


TBC domain 


1.8e-26


101.4


808 


CN_hydrolase


Carbon-nitrogen hydrolase 


8.8e-80 




811 


CBFD_NFYB_HM
F


Histone-like transcription
factor 


fie- 14 


CO Q 

• o 


812 


adh_short


short chain dehydrogenase


8.1e-20


79.3


814 


IMP4 


Domain of unknown function 


3.3e-71


250.0


815 


zf-C2H2


Zinc finger, C2H2 type


8.2e-66


232.1 


816 


Pept_tRNA_hy 
dro 


Peptidyl-tRNA hydrolase 


"fTSe-3 7 


138.0


817 


ARID 


ARID DNA binding domain 


2.5e-18 


74.3 


826 


IF5_eIF4_eIF
2


eIF4-gamma/eIF5/eIF2-epsilon


1.6e-32


221.5


830 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


1.5e-53 


191.3 


831 


LRR 


Leucine Rich Repeat 


2.1e-26 


101.1


832 


laminin_EGF


Laminin EGF-like (Domains III
and V)


2e-57


204.2


839 


rrm 


RNA recognition motif. 


1.3e-22 


88 , £ " 


840 


Y_phosphatas
e


Protein-tyrosine phosphatase


2.6e-119


409.8


"841 


pkinase 


Eukaryotic protein kinase 
domain 


3.4e-100 


346.3




Ribosomal_L2
2e


Ribosomal L22e protein family


1e-64


228.4 


846 


IBR 


IBR domain 


9e-15 


62.5 


849 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


7.4e-07 


26.5 


850 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger) 


0.00016 


18.9 


851 


SET 


SET domain 


5e-30 


113.2 


852 


SRCR 


Scavenger receptor cysteine-
rich domain


0


1025.4


853


SRCR 


Scavenger receptor cysteine- 
rich domain 


0 


1025.4


857 


lactamase_B


Metallo-beta-lactamase
superfamily


0.012 


•6,0 


858


COX6A


Cytochrome c oxidase subunit
VIa


3.4e-58 


206.7 


859


rrm 


RNA recognition motif. 


5.4e-45 


162.9 


861 


PRK 


Phosphoribulokinase 


5.1e-62 


219.4 


863 


mito_carr


Mitochondrial carrier proteins


2.9e-53


185.5 


864


HSP90 


Hsp90 protein 


' 4.7e-1^8 


538.5 


866 


ig 


Immunoglobulin domain 


4e-12 


44.1


867 


zf-C2H2


Zinc finger, C2H2 type 


7e-135 


4^1.5 


872 


histone 


Core histone H2A/H2B/H3/H4 


4.9e-41


149.8 


874 


CPSase_L_cha
in


Carbamoyl-phosphate synthase
(CPSase)


2.1e-218


739.0


879 


Ribosomal_S1
2e


Ribosomal protein S12e


2.1e-98


340.3 


882 


serpin 


Serpins (serine protease 
inhibitors) 


2.5e-42 


145.7 


883 


Patatin 


Patatin 


1.2e-51


182.0


884 


RA 


Ras association (RalGDS/AF-6) 
domain 


0.044


8.0


887 


DUF92 


Integral membrane protein DUF92 


2.7e-12


54.3


889 


sugar_tr 


Sugar (and other) transporter 


5.2e-63


222.1


893 


DUF28 


Domain of unknown function 
DUF28 


1.3e-43


158.3


896


IP_trans


Phosphatidylinositol transfer
protein 


£ .5e-98 


338.7


898 


DEAD 


DEAD/DEAH box helicase


1.5e-48 


15^ .5 


899 


KE2 


KE2 family protein 


7e-61 


215.7 


900 


KE2 


KE2 family protein 


4.3e-51


183.2


901 


zf-C2H2


Zinc finger, C2H2 type 


2.7e-57 


203.8


902 


ras 


Ras family 


2.3e-75 


263.8


904 


TPR 


TPR Domain 


3.2e-22 


87.2 


906 


GBP 


Guanylate-binding protein


8.9e-253


853.1


907 


GBP 


Guanylate-binding protein 


1.1e-239


809.6 


908 


WD40 


WD domain, G-beta repeat 


2.6e-26


100.8


909 


PH 


PH domain 


1.3e-09


3£.4 


910 


zf-C2H2 


Zinc finger, C2H2 type 


2.5e-39


144.1


913 


Epimerase 


NAD dependent 

epimerase/dehydratase family 


5e-07 


-88.5 


921 


TBC


TBC domain 


1.5e-09 


30.7 


922 


WD40 


WD domain, G-beta repeat 




98.2 


923 


WD40 [ 


WD domain, G-beta repeat 


8.2e-07


36.1 


924 


Hydrolase 


haloacid dehalogenase-like 
hydrolase 


2.9e-05 


29.1 


925 


UQ_con 


Ubiquitin-conjugating enzyme


0.60033 


-27.6


926 


CH 


Calponin homology (CH) domain


3.3e-53 


190.2 


928 


WD40 


WD domain, G-beta repeat 


5.9e-48


172.7


929 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


3.1e-10


37.4 


930 


Ribul_P_3_epim


Ribulose-phosphate 3 epimerase
family 


7.2e-10S 




931 


Ribul_P_3_epim


Ribulose-phosphate 3 epimerase 
family 


1.2e-96


334.4


936 


C2 


C2 domain 


2.2e-62 


220.7 


/ 


NAP_family 


Nucleosome assembly protein
(NAP) 


l.le-i!2 


64. £ 


940 


abhydrolase 


alpha/beta hydrolase fold 


0.011 


3.1 


944 


Tropomyosin 


Tropomyosins


3.2e-07 


25.1 


948 


pkinase 


Eukaryotic protein kinase
domain 


3.4e-75 


263.2


949 


WD40


WD domain, G-beta repeat


1.8e-27 


104.7 


950 


Acyltransferase


Acyltransferase


1.6e-07 


38.4


SEQ ID
NO: 


PFAM NAME 


DESCRIPTION 


p-value


SCORE 


951 


SAM 


SAM domain (Sterile alpha 
motif) 


0.014


14.5


954 


GFO_IDH_MocA


Oxidoreductase family 


1.3e-11


52.0


955 


BTB 


BTB/POZ domain 


7e-22 


86.1 


956 


BTB 


BTB/POZ domain 


7e-22 


86.1


957 


CDP-OH_P_transf


CDP-alcohol phosphatidyltransferase


" 0.053 


-22.2 


959 


ras 


Ras family 


2.4e-97 


336.8


960 


ras 


Ras family 


8.4e-43 


155.6 


961 


Acetyltransf 


Acetyltransferase (GNAT) family


1.2e-08 


42.2 


962 


adh_short 


short chain dehydrogenase 


2.4e-31 


117.6 


963 


mutT 


Bacterial mutT protein 


5.6e-06 


26.2


969


IF-2B


Initiation factor 2 subunit 
family 


8.4e-193 


653.9 


970 


RNase PH 


3' exoribonuclease family


9e-24 


92.4


975 


WW


WW domain 


5.7e-2S 


96.4 


977 


PDZ 


PDZ domain (Also known as DHR 
or GLGF).


3.6e-21


q *i n 
Oj . / 


978 


Ribosomal_L17


Ribosomal protein L17


2.4e-20 


81.0 


"979 


LIM


LIM domain containing proteins 


5.8e-42




980 


Calsequestrin


Calsequestrin


1.7e-297


1 (\f\t i 
lUwl • 7 


982 


HSP20 


Hsp20/alpha crystallin family




' a\ 3 


983 


oxidored_q6 


20 Kd sub 


A O e »_(i'i 
*k • OB*BJ 


100 a 
c<z< .9 


988 


TBC 


TBC domain




180.8 


989 


TBC 


TBC domain 


2.2e-50 


180.8 


993 


tRNA_int_endo






-34.2


994 


homeobox 


Homeobox domain 


id 


_ 

73.6


997 


pyr_redox 


Pyridine nucleotide-disulphide
oxidoreducta 




11.6


1000 


mito_carr 


Mitochondrial carrier proteins 


9.7e-123 




1001 


RA 


Ras association (RalGDS/AF-6) 
domain 


1.2e-15


id 4 


"1004 



DUF81


Domain of unknown function 
DUF81 


0.099 


in o 


1005 


actin


Actin 


1.3e-174


574.3


1006 


actin 


Actin 


3.1e-130 


428.6


1007


cpn60_TCP1


TCP-1/cpn60 chaperonin family


3.7e-195


661.8


1008 


TPR 


TPR Domain 


8.1e-44


159.0 


1009 


zf-C2H2


Zinc finger, C2H2 type 


3.6e-61


216.6 


1011 


zf-C2H2


Zinc finger, C2H2 type 


3.6e-61 


216.6


1012 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


4.7e-13


53 ;i 


1016 


tRNA-synt_2c


tRNA synthetases class II (A) 


2.3e-15


55.2 


1018


RhoGAP 


RhoGAP domain


1.6e-78 


274.3


1022 


PGAM 


Phosphoglycerate mutase family


3.8e-18 


69.7 


1026 


HMG_box


HMG (high mobility group) box 


8.4e-20 


79.2 


1027 


TBC 


TBC domain 


7.3e-45 


162.5 


1028 


UQ_con 


Ubiquitin-conjugating enzyme


1.4e-49


178.1 


1032 


PDZ 


PDZ domain (Also known as DHR 
or GLGF).


0.028


16.3


1034 


Hydrolase 


haloacid dehalogenase-like
hydrolase 


2e-21 


84.6


1037 


KRAB 


KRAB box 


4.8e-06 


32.4


1038 


Cation_efflux


Cation efflux family 


7.1e-42 


152. $ 


1040 


ART 


NAD:arginine ADP- 
ribosyltransferase


4.7e-47 


169.1 


1042 


WD40 


WD domain, G-beta repeat 


1.9e-18


74.7 


1043 


zf-C2H2 


Zinc finger, C2H2 type 


3.7e-24 


93.7


1045 


lectin_c


Lectin C-type domain 


1.9e-28 


108.0 


1046


Glucosamine_iso


Glucosamine-6-phosphate
isomerase 


0.00013 


-25.1


1047


ligase-CoA


CoA-ligases


4.5e-80


279.4


1049


ig


Immunoglobulin domain


1.7e-09


35.6


1050


Ribosomal_L24e


Ribosomal protein L24e


2e-33


124.5


ITJsl


Amidase 


Amidase


4.3e-152 


518.7


1055 


rrm 


RNA recognition motif. 


3.8e-26 


"100.3 


1058 


annexin 


Annexin 


6.9e-44 


159.2 


1059 


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family 


0.023 


-23.6 


1060 


homeobox 


Homeobox domain 


3.2e-31 


117.2 


1062


Acyltransferase


Acyltransferase


0.000*5


16.5


1064


AMP-binding


AMP-binding enzyme


6.6e-100


345.3


1065


LRR


Leucine Rich Repeat


3.3e-14


60.6


1066 


GTP1_OBG


GTP1/OBG family 


4.8e-41 


141.6 


1071 


ig 


Immunoglobulin domain 


8.4e-48 


159.1 


1072 


PHD 


PHD-finger


6.8e-07


36.3


1074 


DENN 


DENN (AEX-3) domain 


8.3e-33 


121.5 


1075 


SCP 


SCP-like extracellular protein 


4.7e-41


149.8


1077 


OLF 


Olfactomedin-like domain


2.2e-6^ 


234.0 


1078 


mito_carr


Mitochondrial carrier proteins 


1e-42




1079 


WD40 


WD domain, G-beta repeat 


6.2e-45 


162.7 


1087


START 


START domain 


1.5e-40


1 1 A 1 


1093 


DSPC 


Dual specificity phosphatase, 
catalytic doma


^ , 3e-63 




1094 


GSHPx 


Glutathione peroxidases 


9.6e-41




1095 


DUF25 


Domain of unknown function 
DUF25 


2e-75 


•JC4 ft 


1096 


DUF25


Domain of unknown function 
DUF25 


6e"^7"5~ 


"2^2.4 


1105 


Nitroreductase


Nitroreductase family 


1.3e-13


58.6


1106


PTE 


Phosphotriesterase family 


1.3e-179


610.1 


1107 


DAGKc 


Diacylglycerol kinase catalytic 
domain 


0.00049 


19.6 


1109 


ras 


Ras family 


1.3e-15


40.7 


1115 


ArfGap 


Putative GTP-ase activating 
protein for Arf


9.7e-47 


168.7 


1116 


HMG14_17


HMG14 and HMG17 


4.4e-21


83.5


1117 


HMG14_17 


HMG14 and HMG17 


"9.9e-12 


"52.4 


1119 


FAA_hydrolase


Fumarylacetoacetate (FAA) 
hydrolase fam 


2e-83 


290.6 


1120 


pkinase


Eukaryotic protein kinase 
domain 


1.4e-94 


327.6 


1123 


abhydrolase 


alpha/beta hydrolase fold 


9.2e-23 


89.0 


1129


pro_isomerase


Cyclophilin type peptidyl- 
prolyl cis-tr 


2.2e-56 


197.1 


1131 


DnaJ 


DnaJ domain 


1.6e-30 


114.9 


1132 


WD40 


WD domain, G-beta repeat 


1.3e-19 


78.6


1133 


WD40 


WD domain, G-beta repeat 


1.8e-l£ 


64.9 


1134 


PH 


PH domain 


0.0015


17.8


113* 


Adap_comp_sub


Adaptor complexes medium 
subunit family 


1.2e-256


866.0


1137 


Adap_comp_sub


Adaptor complexes medium 
subunit family 


2.5e-209 


768.8 


1139 


ras 


Ras family 


1.5e-86


301.0 


1141


pkinase 


Eukaryotic protein kinase 
domain 


9.4e-74 


258.4 


1152 


Acyltransferase


Acyl transferase 


1.2e-05


29.9 


1153


IRS 


PTB domain (IRS-1 type)


5.4e-5S 


196.1 


1155 


ig 


Immunoglobulin domain 


1.3e-31 


106.9


TiST 


Asparaginase_2


Asparaginase 


6.4e-72 


252.3 


1159 


GMC_oxred


GMC oxidoreductases


4.7e-142 


485.3 


1160 


zf-AN1


AN1-like Zinc finger


0.00021 


27.9 


1163 


linker_histone


linker histone H1 and H5 family


3.8e-14 


60.4 


1164 


DED


Death effector domain 


3.9e-05 


30.5 


1165 


IRS 


PTB domain (IRS-1 type)


2.6e-43 


157.3 


1166 


IRS 


PTB domain (IRS-1 type)


2.6e-43 


157.3 


1168 


SAM 


SAM domain (Sterile alpha 
motif) 


0.04 


10.5 


1170 


abhydrolase 


alpha/beta hydrolase fold


0.098


-7.5 


1174 


SAP 


SAP domain 


3.9e-10 


47.1 


1177 


PP2C 


Protein phosphatase 2C 


5.3e-31 


112.5 


1178 


WD40 


WD domain, G-beta repeat 


4.7e-35 


129.9


1180 


Ets 


Ets-domain 


1.8e-09 


33.3 


1181 


Collagen 


Collagen triple helix repeat 
(20 copies) 


0.00016


24.7


1182 


TCL1_MTCP1


TCL1/MTCP1 family 


9.5e-56


198.6 


1184 


RasGEF 


RasGEF domain 


1.7e-88 


307.4 


1185 


mito_carr


Mitochondrial carrier proteins 


1.5e-62 


217.3 


1187 


UPAR_LY6 


u-PAR/Ly-6 domain 


0.0042 


15.6 


1188 


Orn_DAP_Arg_deC


Pyridoxal-dependent
decarboxylase 


6.2e-128 


■"430.* 


1193 


Stathmin 


Stathmin family 


1.8e-90 


314.0


1194 


Stathmin 


Stathmin family 


1.8e-90 


314.0 


1195 


Sec1


Sec1 family


3.2e-183


622.1 


1196 


pyr_redox 


Pyridine nucleotide-disulphide
oxidoreducta 


3.1e-32 


111.8


1197 


Glyco_transf_8


Glycosyl transferase family 8 


1.2e-09 


45.5 


1202 


K_tetra 


K+ channel tetramerisation 
domain 


0.022 


-16.8


1203 


adh_short 


short chain dehydrogenase 


8.3e-45 


162.3 


1206 


Ubie_methyltran


ubiE/COQ5 methyltransferase
family 


1.3e-121 


417.4 


1208 


7tm_3 


7 transmembrane receptor 


7.2e-09 


29.0 


1209 


ank 


Ank repeat 


3.9e-15 


63.7 


1210 


VATP-synt_AC39


ATP synthase (C/AC39) subunit 


2.5e-128 


439.7 


1212 


zf-C2H2 


Zinc finger, C2H2 type 


5.5e-17 


69.9 


1213 


efhand 


EF hand 


3.2e-07 


37.4 


1219 


rrm 


RNA recognition motif. 


2.1e-40 


147.7 


1220 


DUF6 


Integral membrane protein DUF6 


0.015 


21.5 


1222 


SCAN 


SCAN domain 


1.5e-71


251.1


1223 


G-gamma


GGL domain 


3.6e-36 


129.5 


1227 


catalase 


Catalase 


0 


1158.9 


1232 


PX 


PX domain 


2.2e-15 


64.5 


1233 


PX


PX domain 


2.2e-15 


64.5 


1236 


FCH 


Fes/CIP4 homology domain 


3.3e-09 


44.0 


1241 


Peptidase_M20


Peptidase family M20/M25/M40 


2e-63 


224.1 


1243 


WW 


WW domain 


0.044 


17.9 


1247 


UPF0006 


Metalloenzyme of unknown 
function UPF0006


6.3e-61 


215.8 


1248 


Glycos_transf_2


Glycosyl transferases 


4.5e-10 


46.9 


1249 


efhand 


EF hand 


4e-11


50.4 


1254 


UQ_con 


Ubiquitin-conjugating enzyme


2.1e-73 


257.3 


1255 


ras 


Ras family 


2.2e-62 


220.7 


1256 


formyl_transf


Formyl transferase 


4.9e-30 


108.3 


1259 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


5.3e-13 


46.4 


1261 


DiHfolate_red


Dihydrofolate reductase 


2.1e-69 


241.7 


1262 


G_glu_transpept


Gamma-glutamyl transpeptidase 


1.8e-110 


380.4


1263 


PAS 


PAS domain


1.3e-08 


36.9 


1265 


LRR 


Leucine Rich Repeat 


4.2e-22 


86.9 


1266


""SCP 


SCP-like extracellular protein 


6e-29 


108.0 


1267 


K_tetra 


K+ channel tetramerisation 
domain 


2.8e-27


LUl a U 


1269 


ras 


Ras family 


1.3e-85


297.9


1275 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger) 


4.2e-10


37 0 
9 l » v 


1276 


abhydrolase 


alpha/beta hydrolase fold


5.4e-23 


89.8 


1277 


abhydrolase 


alpha/beta hydrolase fold


5.6e-21 


83.1 


1279 


trypsin 


Trypsin 


4.4e-41


132775 


1280


PBP 


Phosphatidylethanolamine-
binding protein 


1.3e-13 


58.7 


1285


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


5.6e-14


49.6


1287 


ank 


Ank repeat 


1.7e-52


187.8


1294 


fn3


Fibronectin type III domain 


0.026


20.9 


1295 


GBP 


Guanylate-binding protein


V ft \J \J W V 


-m n 


1296 


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family 


6.9e-41 


149.3 


1297 


Rhodanese 


Rhodanese-like domain


3.2e-14


60.7


1298 


LIM 


LIM domain containing proteins 


5.8e-21


79.1 


1301 


rnaseA 


Pancreatic ribonucleases 






1307 


mito_carr


Mitochondrial carrier proteins 


2.1e-53 


186.0 


1308 


WD40 


WD domain, G-beta repeat


JL . be-17 


71.6


1316 


UPAR_LY6


u-PAR/Ly-6 domain




75.5


1313 


thiored 




\ Cent: 


21.6


1314 


Aa_trans 


Transmembrane amino acid 

Lj.auiy^JUi Let ^i*UlfCli( 


1.5e-*7 


237.9 


1316 


trypsin 




A /1q_A1 


132.0


1320 


3 


Ribonnim] nrnhftin r.1 1 


j . se-bi 


219.8


1327 


Armadillo_seg


repeats 


ft AACA 


23.4


1328 


KRAB 




0 . 0S2 


-5.6 


1329 


rrm 


RNA recognition motif. 


2.1e-40 


147.7 


1330 


Bcl-2


Apoptosis regulator proteins,
Bcl-2 family


0.014


-1.6 


1331 


PX 




2.1e-10


48.0


1333 


KRAB 




j. . be- Jo 


134.6


1334 


UPP_synthetase


diphosphate synt 


O 1 n DO 


310.3 


1335 


UPP_synthetase


Putative undecaprenyl 
diphosphate synt 


1.8e-59


oi 1 n 
Zil . U 


1336 


DSPC 


Dual specificity phosphatase, 
catalytic doma 


1.2e-31


118.6 


1337 


DSPC


Dual specificity phosphatase, 
catalytic doma 


2.3e-12


54.5


1338 


TPR 


TPR Domain 


0.00021


28.1


1340 


metalthio


Metallothionein


0.013 


20.3


"1341 


mutT 


Bacterial mutT protein 


5.8e-09 


36.5 


1343 


Band_41


FERM domain (Band 4.1 family) 


1.3e-38 


122.5


1344 


Kelch 


Kelch motif 


1.4e-44 


161.5 


1345 


Antifreeze 


Antifreeze protein 


1.2e-10


48.8


1347 


3Beta_HSD


3-beta hydroxysteroid
dehydrogenase/isomera


0.086


-177.2


1348 


BTB 


BTB/POZ domain


5.3e-28 


106.5


1349 


DUF6 


Integral membrane protein DUF6 


0.033


15.8


1350 


myosin_head 


Myosin head (motor domain) 


0 


1088.7 


1352 


Nramp


Natural resistance-associated 
macrophage pro 


1.2e-202 


686.6 


1353 


S_100 


S-100/ICaBP type calcium 
binding domain 


5.3e-23 


89.9 


1355 


DEAD 


DEAD/DEAH box helicase 


3.6e-65 


209.0 


1356


C2


C2 domain 


2.4e-15 


64.4 


1357 


RBD 


Raf-like Ras-binding domain


4.2e-57 


203.1


1360 


zf-C2H2


Zinc finger, C2H2 type 


7.4e-141 


481.4 


1361 


HMG14_17


HMG14 and HMG17 


7.9e-40 


145.7 


1362 


SIS 


SIS domain 


3.8e-30


113.6


1363 


SIS 


SIS domain 


1.3e-28


108.5


1364 


ig 


Immunoglobulin domain 


0.00026


19.0


1368 


K_tetra 


K+ channel tetramerisation 
domain 


1.1e-16


£8.9 


1371


Collagen


Collagen triple helix repeat 
(20 copies) 


2.2e-113 


390.1


1372 


DnaJ 


DnaJ domain 


6.6e-36 


132.7


1376 


KRAB 


KRAB box 


2.1e-38


"141/0 


1378 


ELM2 


ELM2 domain


" '2e-23 


91.3 


1380


thiored 


Thioredoxin


1.2e-23 


"^82.8 


1381 


ank 


Ank repeat 


2.3e-83 


290.4


1382 


BTB 


BTB/POZ domain


3e-11


50.8


1383


WD40 


WD domain, G-beta repeat 


1.6e-19


78.3


"1384 


WD40 


WD domain, G-beta repeat 


£ .3e-24 


92.9


1387 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


1.1e-09


" 35. €"' 


1389 


zf-C2H2


Zinc finger, C2H2 type 


5.5e-50 


"179.5 """ 


1390 


zf-C2H2 


Zinc finger, C2H2 type 


2.5e-85


296.9 


1393 


kinesin 


Kinesin motor domain 


7.8e-188


"637.4 


1394 


zf-C2H2 


Zinc finger, C2H2 type 


1.2e-49


178.4


1398 


KRAB 


KRAB box 


5.1e-22 




1402 


bZIP 


bZIP transcription factor


0.035


U . 1 


1405 


sugar_tr 


Sugar (and other) transporter 


0.003


-101.5


1406 


RhoGAP 


RhoGAP domain 




168.8 


1407 


rrm 


RNA recognition motif. 






1408


LRR 


Leucine Rich Repeat 




58.0


1409 


Nebulin_repeat


Nebulin repeat 




192.6


1410 


ank 


Ank repeat 


1.6e-17 


/ 1 . o • 


1412 


Ribosomal_L5_C




8.2e-58




1415 


trypsin 


Trypsin 


4.7e-85


270.4


1416 


aminotran_1


Aminotransferases class-I


4.4e-05 


-91.2 


1417 


S1


S1 RNA binding domain


1.6e-07


jj . i 


1419 


WD40 


WD domain, G-beta repeat 


2.2e-09


44.6


1422 


cadherin 


Cadherin domain 


8.3e-42


152.3


1424 


SH3 


SH3 domain 


2 . 5e-B0 


o an *i 

ZOKI • J 


1425 


PHD 


PHD-finger


3.2e-17


70.6


1426 


PHD 


PHD-finger 


3.2e-17


70.6 


1427 


ArfGap


Putative GTP-ase activating 
protein for Arf 


le-SI 


"138.8 


1428 


helicase_C


Helicases conserved C-terminal
domain 


1e-26


102.2


1429 


WD40 


WD domain, G-beta repeat 


3.9e-07


37.2 


1430 


inositol_P 


Inositol monophosphatase family 


2.5e-10 


40.2 


1431 


mito_carr


Mitochondrial carrier proteins 


4.3e-83 


287.7 


1433 


C1q


C1q domain


2.9e-16


66.2 


1434 


WD40 


WD domain, G-beta repeat 


1.6e-13 


58.3 


1435 


Inos-1-P_synth


Myo-inositol-1-phosphate
synthase 


7e-228 


770.4 


1436 


rrm 


RNA recognition motif. 


1.4e-34 


128.3 


1438 


ig 


Immunoglobulin domain 


1.3e-12 


45.6 


1440 


G_Adapt_CT 


Gamma-adaptin, C-terminus


3.4e-67


236.7 


1441 


G_Adapt_CT


Gamma-adaptin, C-terminus 


3.4e-67


234.7 


1443 


Kelch 


Kelch motif 


0.00013 


28.7


1446 


ARID 


ARID DNA binding domain 


1.8e-21 


84.7 


1447 


zf-C2H2


Zinc finger, C2H2 type


9.4e-28 


105.6


1448


AMP-binding 


AMP-binding enzyme 


2.4e-07 


-145.1 


1451 


rrm 


RNA recognition motif. 


6.5e-21


82.9 


1454 


ig


Immunoglobulin domain 


5.6e-44 


146.7 


1455 


Sialyltransf


Sialyltransferase family


5.4e-21 


83.2 


1460 


Aldose_epim


Aldose 1-epimerase


1.9e-3^ - 


131.2 


1461 


C2 


C2 domain 


4e-18 


73.6 


1470 




IPT/TIG domain


3.1e-19 


77.3 


1472 




RNA pseudouridylate synthase 


4.3e-16 


66.9 


1474 


DENN 


" DENN (AEX-3) domain" 


1.3e-44 


161.6 


1475 


Cation_efflux


Cation efflux family


4.6e-49 


176.4


1477 


TBC 


TBC domain 


8e-47 


169.0


1478 


rrm 


RNA recognition motif. 


2e-21 


84.6 


1480 


ig 


Immunoglobulin domain 


5.5e-06 


24.3


1484 


Telo_bind_alpha


Telomere-binding protein alpha
subuni 


0.028 


-225.9 


1485 


zf-C2H2 


Zinc finger, C2H2 type 


1.8e-68


" 240.9 - 


"Hag 


pkinase


Eukaryotic protein kinase
domain 


9.Se-13 " " " 




1488 


helicase_C 


Helicases conserved C-terminal
domain 


1.4e-15 


65.2 


1489 


DUF89 


Protein of unknown function 
DUF89 


" 0.079 


-132.4 


1490


ECH 


Enoyl-CoA hydratase/isomerase
family 


5.2e-41




1491 


guanylate_cyc


Adenylate and Guanylate cyclase 
catalyt 


5.9e-46


166.1 


1492 


LRR 


Leucine Rich Repeat 


3.4e-19 


77.2


1495 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


7.1e-10


36* . 3 " 


1497 


pkinase 


Eukaryotic protein kinase 
domain 


1e-22


85.8 


1500 


SH3 


SH3 domain 


3 .3e-0S> 


27.2


1502 


homeobox 


Homeobox domain 


0.084


13.8


1503 


homeobox


Homeobox domain 


0.084 


13.8 


1505 


EGF 


EGF-like domain


2.7e-23


90.8


1506 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family 


2.7e-21


"84 .2 


1508 


Peptidase_M20


Peptidase family M20/M25/M40 


2.8e-28


*vi • D 


1511 


PX 


PX domain 


1.9e-11


51.5


1512 


sulfatase 


Sulfatase 


2.8e-35


130.7


1516 


syntaxin 


Syntaxin 


0.011


-62.3


1518 


aminotran_3


Aminotransferases class-III
pyridoxal-pho 


9.7e-106 


305.6 


1520 


ig 


Immunoglobulin domain 


0.075


11.0 


1521 


RA 


Ras association (RalGDS/AF-6)
domain 


0_6l3 


13T5 


1523 


RhoGAP 


RhoGAP domain 


2.5e-05 


18.7


1528 


WD40 


WD domain, G-beta repeat 


5.4e-24 


93.1 


1535 


IMS 


impB/mucB/samB family 


7.8e-95 


328.5 


1538 


FYVE 


FYVE zinc finger 


3.2e-27


101.5 


1539 


DAGKc 


Diacylglycerol kinase catalytic
domain 


6*6-07 


36.5 


1540 


Ocular_alb


Ocular albinism type 1 protein 


0 


1184.7


1653 


SAP 


SAP domain 


6e-06


33.2 


"1654 


Amino_oxidase


Flavin containing amine oxidase 


3.2e-43 


T^7 . 6 


1655


Amino_oxidase


Flavin containing amine oxidase 


3.2e-43 


157.0


1656 


RhoGEF 


RhoGEF domain 


1.4e-24 


95.1 


1657 


MMR_HSR1


GTPase of unknown function 


0.0011 


-45.5 


1659


UCH-2


Ubiquitin carboxyl-terminal
hydrolase family 


2.5e-11


51.1


1660


actin


Actin


6.6e-21 


49.9 


1661 


BAH 


BAH domain


1.7e-82


T87.£ 


1662 


vwa 


von Willebrand factor type A 
domain 


0 


1909.4


1**3 


WD40 


WD domain, G-beta repeat 


1.4e-67 


237.9 


1667 


zf-C2H2


Zinc finger, C2H2 type


1.3e-93 


324.4 


1669


Nol1_Nop2_Sun


NOL1/NOP2/sun family


1.3e-23


84.3 


1671


SH2 


Src homology domain 2 


5.4e-15


46.9 


1672 


chromo 


'chromo' (CHRromatin
Organization Modifier) 


2.1e-18 


67.7 


1674 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H 
type 


0.0025 


""17.6 


1676 


Glyco_hydro_47


Glycosyl hydrolase family 47 


1.8e-187 


636.2 


1677 


Glyco_hydro_47


Glycosyl hydrolase family 47 


4.5e-74


259.5 


1680 


WD40 


WD domain, G-beta repeat 


1.1e-27


105.5 


1681


WD40


WD domain, G-beta repeat 


1.1e-27


105.5 


1683 


MMR_HSR1 


GTPase of unknown function 


1.8e-78 


274.1 


1691 


rrm 


RNA recognition motif. 


1.8e-37 


137.9


1692 


rrm


RNA recognition motif. 


1.8e-37 


137.9 


1693 


AAA 


ATPases associated with various 
cellular act 


1.3e-81 


284.5 


1697 


Ferric_reduct


Ferric reductase like
transmembrane com 


8.4e-87


285.2 


1698 


Ferric_reduct


Ferric reductase like 
transmembrane com 


3.5e-53 


190.1


1699 


zf-C2H2 


Zinc finger, C2H2 type 


4.4e-34 


126.6 


1700 


arf 


ADP-ribosylation factor family


9e-19 


75.8 


1702 


GTP_EFTU


Elongation factor Tu family 


0.014 


11.4 


1703 


SCAN 


SCAN domain 


1.8e-54 


194.4


'170? 


pkinase


Eukaryotic protein kinase 
domain 


1.2e-88 


307.9


1709 


WD40 


WD domain, G-beta repeat 


0.0035 


24.0 


1710 


LRR


Leucine Rich Repeat


1.2e-30 


115.3 


1711 


WW 


WW domain 


7.6e-12 


52.8 


1712 


ank 


Ank repeat 


4.2e-34


126.7 


1713 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H
type 


2.6e-09 


38.3 


1714 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H
type 


2.6e-09 


38.3


1715 


ras 


Ras family 


4.4e-41


149.9 


1718 


HMG_box 


HMG (high mobility group) box


8.3e-21


82.6 


1719 


TBC 


TBC domain 


1.1e-45


165.2 


1721 


HLH 


Helix-loop-helix DNA-binding
domain 


9.2e-10 


45.9 


1723 


dsrm 


Double-stranded RNA binding
motif 


2.9e-05 


30.9 


1724 


RrnaAD 


Ribosomal RNA adenine
dimethylases 


0.04<S 


"9.2 


1725 


CIDE-N 


CIDE-N domain 


5.9e-40 


145.2 


1726 


HAT 


HAT (Half-A-TPR) repeats


2.9e-44 


160.5


1728 


efhand


EF hand


5.1e-20 


79.9 


1733 


Hist_deacetyl


Histone deacetylase family


1.7e-104 


3*0.6" 


1735 


LRR 


Leucine Rich Repeat 


4.6e-34 


126.6


1739 


PI-PLC-X


Phosphatidylinositol-specific
phospholipase


0.0023 


16.1 


1743 


ras 


Ras family 


3.7e-10 


-21.3 


1744 


ras 


Ras family 


3.7e-10 


-21.3 


1745 


RasGEF


RasGEF domain


3.2e-49 


176.9 


1746


adh_short


short chain dehydrogenase 


7.1e-08


34.6 


1751 


zf-C2H2


Zinc finger, C2H2 type 


9e-39


142.2 


1754 


fn3


Fibronectin type III domain 


"S.Se-101 


348.9 


1756 


zf-C2H2 


Zinc finger, C2H2 type


6.3e-93 


322.1 


1758 




RNA recognition motif.


0.017 


21.2 


1760 


Nop 


Putative snoRNA binding domain


6.1e-95


328.8 


1761 


Nop 


Putative snoRNA binding domain 


6.1e-95


328.8 


1765 


MMR_HSR1


GTPase of unknown function 


6.4e-41 


149.4 


1769 


CN_hydrolase 


Carbon-nitrogen hydrolase


3e-06 


-43.9 


1775 


ank 


Ank repeat 


"4.1e-07 


37.1 


1779 


Oxysterol_BP


Oxysterol-binding protein


4.7e-56 


199.6 


1783 


RhoGEF 


RhoGEF domain 


1.6e-23 


91.6


1784 


RhoGEF


RhoGEF domain 


1.6e-23 


91.6 


1785 


rrm 


RNA recognition motif. 


6.4e-14 


59.7 



TABLE 5





SEQ ID NO: 


POSITION OF 
SIGNAL IN AMINO 
ACID SEQUENCE 


MaxS (MAXIMUM 
SCORE) 


MeanS (MEAN
SCORE) 


1 


1-21 


0.991 


0.955 


2 


1-31 


0.995 


0.944 


3 


1-33 


0.949 


0.736 


4 


1-19 


0.976 


0.951 


5 


1-26 


0.971 


0.863 


6 


1-26 


0.971 


0.863 


7 


1-26 


0.971 


0.863


8 


1-26 


0.971 


0.863 


9 


1-46 


0.982 


0.901 


10 


1-21 


0.991 


0.955 


11 


1-23 


0.989 


0.899 


12 


1-25 


0.955 


0.803 


13 


1-18 


0.932 


0.625 


14 


1-18 


0.938 


0.876 


15 


1-25 


0.941 


0.811 


16 


1-17 


0.972 


0.939 


17 


1-27 


0.964 


0.777 


18 


1-16 


0.914 


0.657 


19 


1-19 


0.953 


0.840 


20 


1-20 


0.935 


0.701 


21 


1-22 


0.974 


0.850 


22 


1-33 


0.961 


0.895 


23 


1-19 


0.991 


0.959 


24 


1-31 


0.995 


0.944 


25 


1-22 


0.976 


0.935 


26


1-27 


0.996 


0.928 


27 


1-24 


0.953 


0.739 


28 


1-21 


0.906 


0.688 


29 


1-31 


0.986 


0.841 


30 


1-28 


0.980 


0.893 


31 


1-19 


0.993 


0.976 


32 


1-22 


0.998 


0.909


35 


1-33 


0.949 


0.736 


36 


1-33 


0.949 


0.736 


46 


1-19 


O.S70 


0.951 


67 


1-25 


O.S68 


0.848 


71 


1-18 


0.949 


0.845 


72 


1-30 


0.991 


0.919 


75 


1-29 


0.958 


0.854 


88 


1-20 


0.986 


0.945 


94


1-33 


0.994 


0.943 




97 


1-46 


0.964 


0.595 




103 


1-49 


0.983 


0.570 




108 


1-26 


0.978 


0.885 




111 


1-23 


0.989 


0.899 




126 


1-25 


0.955 


0.803 




129 


1-19 


0.963 


0.918 




138 


1-29 


0.971 


0.844 




143 


1-18 


0.914 


0.628 




148 


1-20 


0.969 


0.904 




156 


1-25 


0.941 


0.811 




158 


1-22 


0.979 


0.927 




160 


1-17 


0.972 


0.939 




161 


1-48 


0.903 


0.571




162 


1-25 


0.937 


0.729 




168 


1-16


0.939


0.826 




171 


1-27 


0.964 


0.777 




178 


1-21 


0.945


0.82* 




180 


1-27 


0.981 


0.941 




187 


1-28 


0.962 


0.936 




190 


1-19 


0.953 


0.840 




196 


1-22 


0.975


0.916




197 


1-22


0.9*3


0.936 


198


1-20 


0.935 


0.701 


200 


1-23 


0.977 


0.773 


206 


1-30 


0.984


0.890 


207 


1-19 


0.990 


0.924 


208 


1-22 


0.974 


0.850


210 


1-40 


0.940 


0.670 


211 


1-28 


0.971 


0.849 


216 


1-24 


0.986 


0.956 


218 


1-33 


0.961 


0.895 


219 


1-19 


0.970 


0.871 


221 


1-19 


0.904 


0.553 


222 


1-21 


0.917 


0.555 


230 


1-19 


0.991 


0.959 


231 


1-26 


0.953 


0.800 


232 


1-25 


0.988 


0.826 


239 


1-23 




0.828 


240 


1-17 


0.982 


0.955 


241 


1-17 


0.982 


0.955 


245


1-30 


0.970 


0.722 


248 


1-22 


0.976


0.935 


249 


1-23 


0.968 


0.940 


252 


1-18 


0.971 


0.923 


261 


1-24 


0.883 


0.587 


265 


1-18 


0.939 


0.868 


272 


1-24 


0.953 


0.739 


283 


1-21 


0.906


0.688 


284 


1-29 


0.997 


0.854


290 


1-31 


0.986


0.841 


302 


1-28 


0.980 


0.893 


304 


1-16 


0.907 


0.635 


312 


1-19 


0.993


0.976 


313 


1-17 


0.930 


0.753 


323 


1-22 


0.998 


0.909 


324 


1-17 


0.982 


0.954 


328 


1-19 


0.971 


0.865 


329 


1-22 


0.963 


0.924 


330 


1-33 


0.978 


0.841 


331 


1-24 


0.920 


0.712


332 


1-S4 


0.975 


0.841


333 


1-19 


0.984


0.941 


334 


1-20 


0.899 


0.567 


335 


1-27 


0.942 


0.813 


336


1-20 


0.952 


0.850 


337 


1-38


0.942 


0.653 


338 


1-27 


0.973 


0.772 


339 


1-36 


0.979 


0.804 


340 


1-27


0.888 


0.597 


343 


1-19 


0.971 


0.865 


344 


1-22 


0.994 


0.928 


345 


1-17 


0.966 


0.687 


346 


1-19 


0.936 


0.822 


347 


1-22 


0.963 


0.924 


349 


1-24 


0.982 


0.966 


351 


1-21 


0.918 


0.815 


352 


1-31 


0.988 


0.912 


354 


1-31 


0.974


0.839 


355 


1-29 


0.932 


0.632 


356 


1-15 


0.994 


0.969 


357 


1-33 


0.935 


0.726 


360 


1-27


0.938 


0.821 


361 


1-25 


0.9S4 


0.674 


362 


1-22 


0.929 


0.788 


363 


1-21


0.881 


0.715 


364 


1-33 


0.978 


0.841 


365


1-33 


0.978 


0.841 


366 


1-21 


0.916


0.820 


367 


1-19 


0.936


0.822 


368 


1-29 


0.972 


0.874 


370 


1-24 


0.920 


0.712 


371 


1-24 


0.961 


0.773


372 


1-27 


0.919 


0.768 


373 


1-19 


0.986 


0.945 


375 


1-32 


0.994 


"""0.932 


376 


1-34 


0.987 


0.810 


377 


1-17 


0.995


0.950


378 


1-49 


0.971 


0.749 


380 


1-20 


0.968 


0.874 


381 


1-20 


0.928 


0.782 


382 


1-19


0.986 


0.934 


383 


1-28 


0.965


0.829 


384 


1-39 


0.970 


0.551 


386 


1-24 


0.975 


0.881 


388 


1-30 


0.989 


0.868 


389


1-19 


0.984 


0.941 


390 


1-26 


0.971 


0.782


392 


1-20 


0.981 


0.900 


393 


1-16 


0.9£8 


0.890 


394 


1-23 


0.937 


0.701 


397 


1-22 


0.985


0.854 


399 


1-46 


0.977 


0.698 


401 


1-20 


0.899 


0.567 


402 


1-22 


0.967 


0.931 


403 


1-27 


0.992 


0.934 


404 


1-19 


0.991 


0.973 


405 


1-23 


0.994 


0.921 


407 


1-35 


0.^87 


0.658 


408 


1-39 


0.976 


0.551


409 


1-33 


0.897 


0.570 


410 


1-25 


0.990 


0.962 


411 


1-38 


0.977 


0.827 


412 


1-20 


0.944 


0.768 


413 


1-20 


0.988 


0.965 


414 


1-46 


0.993 


0.638 


415 


1-23 


0.981 


0.940 


417 




0.941 


0.672 


418 


" 1-20 


0.952 


0.850 


419 


1-19 


0.986 


0.967 


420 


1-29 


0.965 


0.861 


421 


1-22 


0.889 


0.785 


422 


1-48 


0.982 


0.862 


424 


1-19 


0.979 


0.933 


428 


1-38 


0.942 


0.653 


430 


1-18 


0.947


0.59S 


432 


1-33 


0.957 


0.789 


433 


1-26 


0.979 


0.904 


434 


1-27 


0.962 


0.777 


435 


1-24 


0.998


0.977 


436 


1-27 


0.973 


0.772 


443 


1-15 


0.966 


0.940 


448 


1-36 


0.979 


0.804 


453 


1-41 


0.958 


0.60$" " 




1-33 


0.943 


0.606 


457 


1-27 


0.888 


0.597 


462 


1-16 " ' 


0.925 


0.681 


486 | 


1-27 


0.972 


0.845 


495 


1-24 


0.917 " 


0.636 


498 


1-26 


0.993 


0.890 


505 


1-20 


0.976 


0.926 


"507 ' " 


1-17 ' 


0.966 ' " 


0.687 


510 - 


1-2* 


0.936 


0.593 


511 | 1-23 | 0.930 | 0.593
512 | 1-23 | 0.930 | 0.593
515 | 1-18 | 0.978 | 0.956
523 | 1-19 | 0.936 | 0.822
529 | 1-22 | 0.963 | 0.924
54? | 1-24 | 0.982 | 0.966
550 | 1-30 | 0.933 | 0.713
552 | 1-21 | 0.973 | 0.912
554 | 1-23 | 0.969 | 0.784
571 | 1-21 | 0.918 | 0.815
574 | 1-31 | 0.988 | 0.912
580 | 1-39 | 0.?25 | 0.556
594 | 1-31 | 0.974 | 0.839
608 | 1-29 | 0.932 | 0.6?2
609 | 1-29 | 0.932 | 0.632
610 | 1-21 | 0.990 | 0.948
621 | 1-15 | 0.594 | 0.969
623 | 1-33 | 0.935 | 0.726
653 | 1-27 | 0.938 | 0.827
668 | 1-22 | 0.929 | 0.788
677 | 1-16 | 0.948 | 0.807
685 | 1-21 | 0.881 | 0.715
699 | 1-22 | 0.975 | 0.816
702 | 1-31 | 0.968 | 0.898
707 | 1-16 | 0.860 | 0.562
713 | 1-25 | 0.966 | 0.743
718 | 1-19 | 0.936 | 0.822
719 | 1-20 | 0.961 | 0.824
729 | 1-29 | 0.972 | 0.874
735 | 1-46 | 0.903 | 0.598
746 | 1-14 | 0.916 | 0.730
747 | 1-22 | 0.965 | 0.87?
748 | 1-29 | 0.968 | 0.785
759 | 1-24 | 0.961 | 0.773
767 | 1-27 | 0.919 | 0.768
768 | 1-33 | 0.900 | 0.585
773 | 1-42 | 0.959 | 0.702
779 | 1-19 | 0.986 | 0.945
797 | 1-19 | 0.944 | 0.759
798 | 1-19 | 0.900 | 0.5?8
820 | 1-17 | 0.995 | 0.950
827 | 1-49 | 0.971 | 0.749
848 | 1-20 | 0.968 | 0.874
864 | 1-20 | 0.928 | 0.782
866 | 1-19 | 0.986 | 0.934
873 | 1-23 | 0.948 | 0.886
881 | 1-28 | 0.965 | 0.829
887 | 1-39 | 0.970 | 0.551
927 | 1-30 | 0.989 | 0.868
934 | 1-48 | 0.988 | 0.777
939 | 1-39 | 0.994 | 0.889
944 | 1-26 | 0.971 | 0.782
950 | 1-29 | 0.957 | 0.845
963 | 1-20 | 0.981 | 0.900
964 | 1-20 | 0.886 | 0.558
973 | 1-16 | 0.968 | 0.890
[illegible] | 1-34 | 0.961 | 0.744
981 | 1-20 | 0.953 | 0.822
984 | 1-12 | 0.938 | 0.780
1015 | 1-22 | 0.985 | 0.854
1040 | 1-46 | 0.977 | 0.698
1052 | 1-18 | 0.969 | 0.842
1059 | 1-20 | 0.927 | 0.867
1065 | 1-33 | 0.983 | 0.918
1069 | 1-22 | 0.993 | 0.935


1075 | 1-27 | 0.992 | 0.934
1080 | 1-19 | 0.931 | 0.829
1092 | 1-19 | 0.991 | 0.973
1094 | [illegible] | 0.992 | 0.653
1095 | 1-30 | 0.974 | 0.929
1105 | 1-23 | 0.994 | 0.921
1123 | [illegible] | 0.987 | 0.658
1138 | [illegible] | 0.954 | 0.613
1140 | [illegible] | 0.989 | 0.789
1142 | [missing] | 0.897 | 0.570
1152 | [illegible] | 0.990 | 0.962
1170 | [missing] | 0.977 | 0.827
1176 | [illegible] | 0.944 | 0.768
1187 | 1-20 | 0.988 | 0.965
1189 | 1-35 | 0.967 | 0.839
1192 | 1-46 | 0.993 | 0.638
1193 | 1-16 | 0.925 | 0.710
[illegible] | 1-29 | 0.985 | 0.853
[missing] | 1-23 | 0.981 | 0.940
1225 | 1-29 | 0.941 | 0.672
[missing] | 1-19 | 0.986 | 0.967
[missing] | 1-29 | 0.965 | 0.861
1265 | 1-22 | 0.889 | 0.785
1266 | 1-20 | 0.944 | 0.809
[illegible] | 1-48 | 0.982 | 0.662
[illegible] | 1-19 | 0.979 | 0.933
[missing] | 1-21 | 0.984 | 0.944
[illegible] | 1-19 | 0.984 | 0.953
13?2 | 1-38 | 0.942 | 0.653
[missing] | 1-18 | 0.947 | 0.595
[missing] | 1-33 | 0.957 | 0.789
[illegible] | 1-26 | 0.9?9 | 0.904
[illegible] | 1-27 | 0.962 | 0.777
[illegible] | 1-23 | 0.997 | 0.960
[missing] | 1-24 | 0.998 | 0.977
[missing] | 1-15 | 0.946 | 0.845
[missing] | 1-24 | 0.913 | 0.588
[missing] | 1-19 | 0.982 | 0.929
1416 | 1-12 | 0.931 | 0.891
[missing] | 1-30 | 0.933 | 0.563
1420 | 1-20 | 0.881 | 0.561
1421 | 1-19 | 0.990 | 0.968
1423 | 1-17 | 0.968 | 0.863
1424 | 1-21 | 0.885 | 0.591
1425 | [illegible] | 0.913 | 0.588
1426 | 1-24 | 0.913 | 0.588
1428 | [missing] | 0.967 | 0.899
1430 | [missing] | 0.977 | 0.819
1431 | [missing] | 0.979 | 0.923
1432 | [illegible] | 0.957 | 0.613
1433 | [missing] | 0.921 | 0.753
1434 | [missing] | 0.983 | 0.621
1435 | [missing] | 0.910 | 0.631
1436 | [illegible] | 0.988 | 0.868
1437 | 1-22 | 0.998 | 0.980
1442 | 1-20 | 0.918 | 0.753
1448 | 1-12 | 0.931 | [illegible]
1462 | 1-18 | 0.968 | 0.888
1490 | 1-20 | 0.881 | 0.561
1518 | 1-17 | 0.968 | 0.863
1525 | 1-21 | 0.885 | 0.591
1547 | 1-28 | 0.974 | 0.891
1561 | 1-25 | 0.967 | 0.899
1580 | 1-17 | 0.923 | 0.824
1593 | 1-28 | 0.979 | 0.923


1596 | 1-16 | 0.929 | 0.709
1601 | 1-3? | 0.957 | [illegible]
1606 | 1-22 | 0.979 | 0.831
1607 | 1-20 | 0.974 | 0.770
1608 | 1-32 | 0.921 | 0.753
1614 | 1-33 | [missing] | 0.829
1616 | 1-20 | 0.959 | 0.869
1625 | 1-39 | 0.983 | 0.621
1632 | 1-25 | 0.910 | 0.431
1636 | 1-33 | 0.697 | 0.591
1639 | [illegible] | 0.988 | 0.868
1645 | 1-20 | 0.927 | 0.568
1647 | 1-17 | 0.923 | 0.742
1648 | 1-22 | 0.998 | 0.980



TABLE 6

SEQ ID NO: of full-length nucleotide sequence | SEQ ID NO: of full-length peptide sequence | SEQ ID NO: of contig nucleotide sequence | SEQ ID NO: of contig peptide sequence | Priority docket number_corresponding SEQ ID NO: in priority application | SEQ ID NO: in U.S. S.N. 09/488,725




1 | 1787 | 3573 | 5359 | 784CIP2_1 | 1103
2 | 1788 | 3574 | 5360 | 784CIP2_2 | 2673
3 | 1789 | 3575 | 5361 | 784CIP2_3 | 4117
4 | 1790 | 3576 | 5362 | 784CIP2_4 | 5556
5 | 1791 | 3577 | 5363 | 784CIP2_5 | 5562
6 | 1792 | 3578 | 5364 | 784CIP2_6 | 5562
7 | 1793 | 3579 | 5365 | 784CIP2_7 | 5562
8 | 1794 | 3580 | 5366 | 784CIP2_8 | 5562
9 | 1795 | 3581 | 5367 | 784CIP2_9 | 5563
10 | 1796 | 3582 | 5368 | 784CIP2_10 | 5564
11 | 1797 | 3583 | 5369 | 784CIP2_11 | 5565
12 | 1798 | 3584 | 5370 | 784CIP2_12 | 5689
13 | 1799 | 3585 | 5371 | 784CIP2_13 | 5729
14 | 1800 | 3586 | 5372 | 784CIP2_14 | 5745
15 | 1801 | 3587 | 5373 | 784CIP2_15 | 5777
16 | 1802 | 3588 | 5374 | 784CIP2_16 | 5777
17 | 1803 | 3589 | 5375 | 784CIP2_17 | 5789
18 | 1804 | 3590 | 5376 | 784CIP2_18 | 5792
19 | 1805 | 3591 | 5377 | 784CIP2_19 | 5804
20 | 1806 | 3592 | 5378 | 784CIP2_20 | 5805
21 | 1807 | 3593 | 5379 | 784CIP2_21 | 5805
22 | 1808 | 3594 | 5380 | 784CIP2_22 | 5844
23 | 1809 | 3595 | 5381 | 784CIP2_23 | 5844
24 | 1810 | 3596 | 5382 | 784CIP2_24 | 5850
25 | 1811 | 3597 | 5383 | 784CIP2_25 | 5867
26 | 1812 | 3598 | 5384 | 784CIP2_26 | 5973
27 | 1813 | 3599 | 5385 | 784CIP2_27 | 5995
28 | 1814 | 3600 | 5386 | 784CIP2_28 | 5995
29 | 1815 | 3601 | 5387 | 784CIP2_29 | 6005
30 | 1816 | 3602 | 5388 | 784CIP2_30 | 6007
31 | 1817 | 3603 | 5389 | 784CIP2_31 | 6007
32 | 1818 | 3604 | 5390 | 784CIP2_32 | 6009
33 | 1819 | 3605 | 5391 | 784CIP2_33 | 6012
34 | 1820 | 3606 | 5392 | 784CIP2_34 | 6015
35 | 1821 | 3607 | 5393 | 784CIP2_35 | 6016
36 | 1822 | 3608 | 5394 | 784CIP2_36 | 6016
37 | 1823 | 3609 | 5395 | 784CIP2_37 | 6018
38 | 1824 | 3610 | 5396 | 784CIP2_38 | 6018
39 | 1825 | 3611 | 5397 | 784CIP2_39 | 6018
40 | 1826 | 3612 | 5398 | 784CIP2_40 | 6023
41 | 1827 | 3613 | 5399 | 784CIP2_41 | 6070
42 | 1828 | 3614 | 5400 | 784CIP2_42 | 6081
43 | 1829 | 3615 | 5401 | 784CIP2_43 | 6089
44 | 1830 | 3616 | 5402 | 784CIP2_44 | 6118
45 | 1831 | 3617 | 5403 | 784CIP2_45 | 6118
46 | 1832 | 3618 | 5404 | 784CIP2_46 | 6130
47 | 1833 | 3619 | 5405 | 784CIP2_47 | 6177
48 | 1834 | 3620 | 5406 | 784CIP2_48 | 6189
49 | 1835 | 3621 | 5407 | 784CIP2_49 | 6191
50 | 1836 | 3622 | 5408 | 784CIP2_50 | 6204
51 | 1837 | 3623 | 5409 | 784CIP2_51 | 6204
52 | 1838 | 3624 | 5410 | 784CIP2_52 | 6284
53 | 1839 | 3625 | 5411 | 784CIP2_53 | 6367
54 | 1840 | 3626 | 5412 | 784CIP2_54 | 6436
55 | 1841 | 3627 | 5413 | 784CIP2_55 | 6442
56 | 1842 | 3628 | 5414 | 784CIP2_56 | 6445
57 | 1843 | 3629 | 5415 | 784CIP2_57 | 6457
58 | 1844 | 3630 | 5416 | 784CIP2_58 | 6458
59 | 1845 | 3631 | 5417 | 784CIP2_59 | 6458


60 | 1846 | 3632 | 5418 | 784CIP2_60 | 6462
61 | 1847 | 3633 | 5419 | 784CIP2_61 | 6472
62 | 1848 | 3634 | 5420 | 784CIP2_62 | 6499
63 | 1849 | 3635 | 5421 | 784CIP2_63 | 6499
64 | 1850 | 3636 | 5422 | 784CIP2_64 | 6505
65 | 1851 | 3637 | 5423 | 784CIP2_65 | 6534
66 | 1852 | 3638 | 5424 | 784CIP2_66 | 6534
67 | 1853 | 3639 | 5425 | 784CIP2_67 | 6540
68 | 1854 | 3640 | 5426 | 784CIP2_68 | 6550
69 | 1855 | 3641 | 5427 | 784CIP2_69 | 6550
70 | 1856 | 3642 | 5428 | 784CIP2_70 | 6592
71 | 1857 | 3643 | 5429 | 784CIP2_71 | 6645
72 | 1858 | 3644 | 5430 | 784CIP2_72 | 6671
73 | 1859 | 3645 | 5431 | 784CIP2_73 | 6763
74 | 1860 | 3646 | 5432 | 784CIP2_74 | 6763
75 | 1861 | 3647 | 5433 | 784CIP2_75 | 6786
76 | 1862 | 3648 | 5434 | 784CIP2_76 | 6824
77 | 1863 | 3649 | 5435 | 784CIP2_77 | 6830
78 | 1864 | 3650 | 5436 | 784CIP2_78 | 6831
79 | 1865 | 3651 | 5437 | 784CIP2_79 | 6832
80 | 1866 | 3652 | 5438 | 784CIP2_80 | 6834
81 | 1867 | 3653 | 5439 | 784CIP2_81 | 6834
82 | 1868 | 3654 | 5440 | 784CIP2_82 | 6835
83 | 1869 | 3655 | 5441 | 784CIP2_83 | [missing]
84 | 1870 | 3656 | 5442 | 784CIP2_84 | 6843
85 | 1871 | 3657 | 5443 | 784CIP2_85 | 6859
86 | 1872 | 3658 | 5444 | 784CIP2_86 | 6915
87 | 1873 | 3659 | 5445 | 784CIP2_87 | 6932
88 | 1874 | 3660 | 5446 | 784CIP2_88 | 6957
89 | 1875 | 3661 | 5447 | 784CIP2_89 | 6961
90 | 1876 | 3662 | 5448 | 784CIP2_90 | 6973
91 | 1877 | 3663 | 5449 | 784CIP2_91 | 6973
92 | 1878 | 3664 | 5450 | 784CIP2_93 | 7007
93 | 1879 | 3665 | 5451 | 784CIP2_94 | 7018
94 | 1880 | 3666 | 5452 | 784CIP2_95 | 7019
95 | 1881 | 3667 | 5453 | 784CIP2_96 | 7020
96 | 1882 | 3668 | 5454 | 784CIP2_97 | 7026
97 | 1883 | 3669 | 5455 | 784CIP2_98 | 7021
98 | 1884 | 3670 | 5456 | 784CIP2_99 | 7023
99 | 1885 | 3671 | 5457 | 784CIP2_100 | 7027
100 | 1886 | 3672 | 5458 | 784CIP2_101 | 7028
101 | 1887 | 3673 | 5459 | 784CIP2_102 | 7029
102 | 1888 | 3674 | 5460 | 784CIP2_103 | 7031
103 | 1889 | 3675 | 5461 | 784CIP2_104 | 7032
104 | 1890 | 3676 | 5462 | 784CIP2_105 | 7033
105 | 1891 | 3677 | 5463 | 784CIP2_106 | 7035
106 | 1892 | 3678 | 5464 | 784CIP2_107 | 7036
107 | 1893 | 3679 | 5465 | 784CIP2_108 | 7039
108 | 1894 | 3680 | 5466 | 784CIP2_109 | 7043
109 | 1895 | 3681 | 5467 | 784CIP2_110 | 7044
110 | 1896 | 3682 | 5468 | 784CIP2_111 | 7046
111 | 1897 | 3683 | 5469 | 784CIP2_112 | 7054
112 | 1898 | 3684 | 5470 | 784CIP2_113 | 7061
113 | 1899 | 3685 | 5471 | 784CIP2_114 | [illegible]
114 | 1900 | 3686 | 5472 | 784CIP2_115 | 7092
115 | 1901 | 3687 | 5473 | 784CIP2_116 | 7094
116 | 1902 | 3688 | 5474 | 784CIP2_117 | 7106
117 | 1903 | 3689 | 5475 | 784CIP2_118 | 7107
118 | 1904 | 3690 | 5476 | 784CIP2_119 | 7111
119 | 1905 | 3691 | 5477 | 784CIP2_120 | 7123
120 | 1906 | 3692 | 5478 | 784CIP2_121 | 7142
121 | 1907 | 3693 | 5479 | 784CIP2_122 | 7142






WO 01/53312 



PCT/US00/34263 





122 | 1908 | 3694 | 5480 | 784CIP2_123 | 7154
123 | 1909 | 3695 | 5481 | 784CIP2_124 | 7160
124 | 1910 | 3696 | 5482 | 784CIP2_125 | 7169
125 | 1911 | 3697 | 5483 | 784CIP2_126 | 718?
126 | 1912 | 3698 | 5484 | 784CIP2_127 | 7197
127 | 1913 | 3699 | 5485 | 784CIP2_128 | 7219
128 | 1914 | 3700 | 5486 | 784CIP2_129 | 7226
129 | 1915 | 3701 | 5487 | 784CIP2_130 | 7229
130 | 1916 | 3702 | 5488 | 784CIP2_131 | 7234
131 | 1917 | 3703 | 5489 | 784CIP2_132 | 7235
132 | 1918 | 3704 | 5490 | 784CIP2_133 | 7235
133 | 1919 | 3705 | 5491 | 784CIP2_134 | 7238
134 | 1920 | 3706 | 5492 | 784CIP2_135 | 7247
135 | 1921 | 3707 | 5493 | 784CIP2_136 | 7261
136 | 1922 | 3708 | 5494 | 784CIP2_137 | 7262
137 | 1923 | 3709 | 5495 | 784CIP2_138 | 7267
138 | 1924 | 3710 | 5496 | 784CIP2_139 | 7272
139 | 1925 | 3711 | 5497 | 784CIP2_140 | 7273
140 | 1926 | 3712 | 5498 | 784CIP2_141 | 7282
141 | 1927 | 3713 | 5499 | 784CIP2_142 | 7288
142 | 1928 | 3714 | 5500 | 784CIP2_143 | 7291
143 | 1929 | 3715 | 5501 | 784CIP2_144 | 7293
144 | 1930 | 3716 | 5502 | 784CIP2_145 | 7294
145 | 1931 | 3717 | 5503 | 784CIP2_146 | 7299
146 | 1932 | 3718 | 5504 | 784CIP2_147 | 7300
147 | 1933 | 3719 | 5505 | 784CIP2_148 | 7312
148 | 1934 | 3720 | 5506 | 784CIP2_149 | 7313
149 | 1935 | 3721 | 5507 | 784CIP2_150 | 7315
150 | 1936 | 3722 | 5508 | 784CIP2_151 | 7318
151 | 1937 | 3723 | 5509 | 784CIP2_152 | 7321
152 | 1938 | 3724 | 5510 | 784CIP2_153 | 7330
153 | 1939 | 3725 | 5511 | 784CIP2_154 | 7331
154 | 1940 | 3726 | 5512 | 784CIP2_155 | 7333
155 | 1941 | 3727 | 5513 | 784CIP2_156 | 7350
156 | 1942 | 3728 | 5514 | 784CIP2_157 | 7352
157 | 1943 | 3729 | 5515 | 784CIP2_158 | 7384
158 | 1944 | 3730 | 5516 | 784CIP2_159 | 7403
159 | 1945 | 3731 | 5517 | 784CIP2_160 | 7431
160 | 1946 | 3732 | 5518 | 784CIP2_161 | 7441
161 | 1947 | 3733 | 5519 | 784CIP2_162 | 7453
162 | 1948 | 3734 | 5520 | 784CIP2_163 | 7467
163 | 1949 | 3735 | 5521 | 784CIP2_164 | 7471
164 | 1950 | 3736 | 5522 | 784CIP2_165 | 7493
165 | 1951 | 3737 | 5523 | 784CIP2_166 | 7502
166 | 1952 | 3738 | 5524 | 784CIP2_167 | 7511
167 | 1953 | 3739 | 5525 | 784CIP2_168 | 7514
168 | 1954 | 3740 | 5526 | 784CIP2_169 | 7526
169 | 1955 | 3741 | 5527 | 784CIP2_170 | 7541
170 | 1956 | 3742 | 5528 | 784CIP2_171 | 7570
171 | 1957 | 3743 | 5529 | 784CIP2_172 | 7578
172 | 1958 | 3744 | 5530 | 784CIP2_173 | 7583
173 | 1959 | 3745 | 5531 | 784CIP2_174 | 7592
174 | 1960 | 3746 | 5532 | 784CIP2_175 | 7601
175 | 1961 | 3747 | 5533 | 784CIP2_176 | 7602
176 | 1962 | 3748 | 5534 | 784CIP2_177 | 7608
177 | 1963 | 3749 | 5535 | 784CIP2_178 | 7615
178 | 1964 | 3750 | 5536 | 784CIP2_179 | 7617
179 | 1965 | 3751 | 5537 | 784CIP2_181 | 7624
180 | 1966 | 3752 | 5538 | 784CIP2_182 | 7626
181 | 1967 | 3753 | 5539 | 784CIP2_183 | 7640
182 | 1968 | 3754 | 5540 | 784CIP2_184 | 7641
183 | 1969 | 3755 | 5541 | 784CIP2_185 | 7641


184 | 1970 | 3756 | 5542 | 784CIP2_186 | 7641
185 | 1971 | 3757 | 5543 | 784CIP2_187 | 7642
186 | 1972 | 3758 | 5544 | 784CIP2_188 | 764?
187 | 1973 | 3759 | 5545 | 784CIP2_189 | 7656
188 | 1974 | 3760 | 5546 | 784CIP2_190 | 7657
189 | 1975 | 3761 | 5547 | 784CIP2_191 | 7657
190 | 1976 | 3762 | 5548 | 784CIP2_192 | 7662
191 | 1977 | 3763 | 5549 | 784CIP2_193 | 7668
192 | 1978 | 3764 | 5550 | 784CIP2_194 | 7673
193 | 1979 | 3765 | 5551 | 784CIP2_195 | 7690
194 | 1980 | 3766 | 5552 | 784CIP2_196 | 7700
195 | 1981 | 3767 | 5553 | 784CIP2_197 | 7709
196 | 1982 | 3768 | 5554 | 784CIP2_198 | 7736
197 | 1983 | 3769 | 5555 | 784CIP2_199 | 7737
198 | 1984 | 3770 | 5556 | 784CIP2_200 | 7744
199 | 1985 | 3771 | 5557 | 784CIP2_201 | 7771
200 | 1986 | 3772 | 5558 | 784CIP2_202 | 7786
201 | 1987 | 3773 | 5559 | 784CIP2_203 | 7791
202 | 1988 | 3774 | 5560 | 784CIP2_204 | 7797
203 | 1989 | 3775 | 5561 | 784CIP2_205 | 780?
204 | 1990 | 3776 | 5562 | 784CIP2_206 | 7812
205 | 1991 | 3777 | 5563 | 784CIP2_207 | 7812
206 | 1992 | 3778 | 5564 | 784CIP2_208 | 7818
207 | 1993 | 3779 | 5565 | 784CIP2_209 | 7822
208 | 1994 | 3780 | 5566 | 784CIP2_210 | 7827
209 | 1995 | 3781 | 5567 | 784CIP2_211 | 7830
210 | 1996 | 3782 | 5568 | 784CIP2_212 | 7835
211 | 1997 | 3783 | 5569 | 784CIP2_214 | 7840
212 | 1998 | 3784 | 5570 | 784CIP2_215 | 7858
213 | 1999 | 3785 | 5571 | 784CIP2_216 | 7858
214 | 2000 | 3786 | 5572 | 784CIP2_217 | 7861
215 | 2001 | 3787 | 5573 | 784CIP2_218 | 7866
216 | 2002 | 3788 | 5574 | 784CIP2_219 | 7868
217 | 2003 | 3789 | 5575 | 784CIP2_220 | 7896
218 | 2004 | 3790 | 5576 | 784CIP2_221 | 7898
219 | 2005 | 3791 | 5577 | 784CIP2_222 | 7900
220 | 2006 | 3792 | 5578 | 784CIP2_223 | 7906
221 | 2007 | 3793 | 5579 | 784CIP2_224 | 7908
222 | 2008 | 3794 | 5580 | 784CIP2_225 | 7909
223 | 2009 | 3795 | 5581 | 784CIP2_226 | 7917
224 | 2010 | 3796 | 5582 | 784CIP2_227 | 7932
225 | 2011 | 3797 | 5583 | 784CIP2_228 | 7940
226 | 2012 | 3798 | 5584 | 784CIP2_229 | 7940
227 | 2013 | 3799 | 5585 | 784CIP2_230 | 7984
228 | 2014 | 3800 | 5586 | 784CIP2_231 | 7984
229 | 2015 | 3801 | 5587 | 784CIP2_232 | 8001
230 | 2016 | 3802 | 5588 | 784CIP2_233 | 8021
231 | 2017 | 3803 | 5589 | 784CIP2_234 | 8029
232 | 2018 | 3804 | 5590 | 784CIP2_235 | 8033
233 | 2019 | 3805 | 5591 | 784CIP2_236 | 8040
234 | 2020 | 3806 | 5592 | 784CIP2_237 | 8052
235 | 2021 | 3807 | 5593 | 784CIP2_238 | 8096
236 | 2022 | 3808 | 5594 | 784CIP2_239 | 8096
237 | 2023 | 3809 | 5595 | 784CIP2_240 | 8113
238 | 2024 | 3810 | 5596 | 784CIP2_241 | 8126
239 | 2025 | 3811 | 5597 | 784CIP2_242 | 8132
240 | 2026 | 3812 | 5598 | 784CIP2_243 | 8137
241 | 2027 | 3813 | 5599 | 784CIP2_244 | 8137
242 | 2028 | 3814 | 5600 | 784CIP2_245 | 8159
243 | 2029 | 3815 | 5601 | 784CIP2_246 | 8159
244 | 2030 | 3816 | 5602 | 784CIP2_247 | 8161
245 | 2031 | 3817 | 5603 | 784CIP2_248 | 8176


246 | 2032 | 3818 | 5604 | 784CIP2_249 | 8196
247 | 2033 | 3819 | 5605 | 784CIP2_250 | 8200
248 | 2034 | 3820 | 5606 | 784CIP2_251 | 8212
249 | 2035 | 3821 | 5607 | 784CIP2_252 | 8220
250 | 2036 | 3822 | 5608 | 784CIP2_253 | 8238
251 | 2037 | 3823 | 5609 | 784CIP2_254 | 8254
252 | 2038 | 3824 | 5610 | 784CIP2_255 | 8255
253 | 2039 | 3825 | 5611 | 784CIP2_256 | 8288
254 | 2040 | 3826 | 5612 | 784CIP2_257 | 8296
255 | 2041 | 3827 | 5613 | 784CIP2_258 | 8329
256 | 2042 | 3828 | 5614 | 784CIP2_259 | 8362
257 | 2043 | 3829 | 5615 | 784CIP2_260 | 8429
258 | 2044 | 3830 | 5616 | 784CIP2_261 | 8436
259 | 2045 | 3831 | 5617 | 784CIP2_262 | 8448
260 | 2046 | 3832 | 5618 | 784CIP2_263 | 8472
261 | 2047 | 3833 | 5619 | 784CIP2_264 | 8502
262 | 2048 | 3834 | 5620 | 784CIP2_265 | 8504
263 | 2049 | 3835 | 5621 | 784CIP2_266 | 8507
264 | 2050 | 3836 | 5622 | 784CIP2_268 | 8509
265 | 2051 | 3837 | 5623 | 784CIP2_269 | 8515
266 | 2052 | 3838 | 5624 | 784CIP2_270 | 8519
267 | 2053 | 3839 | 5625 | 784CIP2_271 | 8530
268 | 2054 | 3840 | 5626 | 784CIP2_272 | 8532
269 | 2055 | 3841 | 5627 | 784CIP2_273 | 8532
270 | 2056 | 3842 | 5628 | 784CIP2_274 | 8539
271 | 2057 | 3843 | 5629 | 784CIP2_275 | 8541
272 | 2058 | 3844 | 5630 | 784CIP2_276 | 8543
273 | 2059 | 3845 | 5631 | 784CIP2_277 | 8593
274 | 2060 | 3846 | 5632 | 784CIP2_278 | 8595
275 | 2061 | 3847 | 5633 | 784CIP2_279 | 8615
276 | 2062 | 3848 | 5634 | 784CIP2_280 | 8620
277 | 2063 | 3849 | 5635 | 784CIP2_281 | 8621
278 | 2064 | 3850 | 5636 | 784CIP2_282 | 8623
279 | 2065 | 3851 | 5637 | 784CIP2_283 | 8625
280 | 2066 | 3852 | 5638 | 784CIP2_284 | 8628
281 | 2067 | 3853 | 5639 | 784CIP2_285 | 8628
282 | 2068 | 3854 | 5640 | 784CIP2_286 | 8629
283 | 2069 | 3855 | 5641 | 784CIP2_287 | 8630
284 | 2070 | 3856 | 5642 | 784CIP2_288 | 8631
285 | 2071 | 3857 | 5643 | 784CIP2_289 | 8633
286 | 2072 | 3858 | 5644 | 784CIP2_290 | 8634
287 | 2073 | 3859 | 5645 | 784CIP2_291 | 8635
288 | 2074 | 3860 | 5646 | 784CIP2_292 | 8636
289 | 2075 | 3861 | 5647 | 784CIP2_293 | 8659
290 | 2076 | 3862 | 5648 | 784CIP2_294 | 8660
291 | 2077 | 3863 | 5649 | 784CIP2_295 | 8667
292 | 2078 | 3864 | 5650 | 784CIP2_296 | 8667
293 | 2079 | 3865 | 5651 | 784CIP2_297 | 8685
294 | 2080 | 3866 | 5652 | 784CIP2_298 | 8805
295 | 2081 | 3867 | 5653 | 784CIP2_299 | 8896
296 | 2082 | 3868 | 5654 | 784CIP2_300 | 8978
297 | 2083 | 3869 | 5655 | 784CIP2_301 | 9046
298 | 2084 | 3870 | 5656 | 784CIP2_302 | 9048
299 | 2085 | 3871 | 5657 | 784CIP2_303 | 9116
300 | 2086 | 3872 | 5658 | 784CIP2_304 | 919?
301 | 2087 | 3873 | 5659 | 784CIP2_305 | 9201
302 | 2088 | 3874 | 5660 | 784CIP2_306 | 9307
303 | 2089 | 3875 | 5661 | 784CIP2_307 | 9321
304 | 2090 | 3876 | 5662 | 784CIP2_308 | 9397
305 | 2091 | 3877 | 5663 | 784CIP2_309 | 9405
306 | 2092 | 3878 | 5664 | 784CIP2_310 | 9406
307 | 2093 | 3879 | 5665 | 784CIP2_311 | 9422


308 | 2094 | 3880 | 5666 | 784CIP2_312 | 9494
309 | 2095 | 3881 | 5667 | 784CIP2_313 | 9512
310 | 2096 | 3882 | 5668 | 784CIP2_314 | 9632
311 | 2097 | 3883 | 5669 | 784CIP2_315 | 9661
312 | 2098 | 3884 | 5670 | 784CIP2_316 | 9664
313 | 2099 | 3885 | 5671 | 784CIP2_317 | 9691
314 | 2100 | 3886 | 5672 | 784CIP2_318 | 9700
315 | 2101 | 3887 | 5673 | 784CIP2_319 | [illegible]
316 | 2102 | 3888 | 5674 | 784CIP2_320 | 9791
317 | 2103 | 3889 | 5675 | 784CIP2_321 | [illegible]
318 | 2104 | 3890 | 5676 | 784CIP2_322 | [illegible]
319 | 2105 | 3891 | 5677 | 784CIP2_323 | 9923
320 | 2106 | 3892 | 5678 | 784CIP2_324 | 9938
321 | 2107 | 3893 | 5679 | 784CIP2_325 | 9964
322 | 2108 | 3894 | 5680 | 784CIP2_326 | 10007
323 | 2109 | 3895 | 5681 | 784CIP2_327 | 10009
324 | 2110 | 3896 | 5682 | 784CIP2_328 | 10046
325 | 2111 | 3897 | 5683 | 784CIP2_329 | 10156
326 | 2112 | 3898 | 5684 | 784CIP2_330 | 10276
327 | 2113 | 3899 | 5685 | 784CIP2_331 | 10283
328 | 2114 | 3900 | 5686 | 784CIP2B_1 | 152
329 | 2115 | 3901 | 5687 | 784CIP2B_2 | 167
330 | 2116 | 3902 | 5688 | 784CIP2B_3 | 205
331 | 2117 | 3903 | 5689 | 784CIP2B_4 | 210
332 | 2118 | 3904 | 5690 | 784CIP2B_5 | 225
333 | 2119 | 3905 | 5691 | 784CIP2B_6 | 22?
334 | 2120 | 3906 | 5692 | 784CIP2B_7 | 264
335 | 2121 | 3907 | 5693 | 784CIP2B_8 | [illegible]
336 | 2122 | 3908 | 5694 | 784CIP2B_9 | [missing]
337 | 2123 | 3909 | 5695 | 784CIP2B_10 | [missing]
338 | 2124 | 3910 | 5696 | 784CIP2B_11 | [missing]
339 | 2125 | 3911 | 5697 | 784CIP2B_12 | 302
340 | 2126 | 3912 | 5698 | 784CIP2B_13 | 311
341 | 2127 | 3913 | 5699 | 784CIP2B_14 | 352
342 | 2128 | 3914 | 5700 | 784CIP2B_15 | 358
343 | 2129 | 3915 | 5701 | 784CIP2B_16 | 368
344 | 2130 | 3916 | 5702 | 784CIP2B_17 | 393
345 | 2131 | 3917 | 5703 | 784CIP2B_18 | 477
346 | 2132 | 3918 | 5704 | 784CIP2B_19 | 508
347 | 2133 | 3919 | 5705 | 784CIP2B_20 | 508
348 | 2134 | 3920 | 5706 | 784CIP2B_21 | [illegible]
349 | 2135 | 3921 | 5707 | 784CIP2B_22 | 578
350 | 2136 | 3922 | 5708 | 784CIP2B_23 | 588
351 | 2137 | 3923 | 5709 | 784CIP2B_24 | 591
352 | 2138 | 3924 | 5710 | 784CIP2B_25 | 593
353 | 2139 | 3925 | 5711 | 784CIP2B_26 | 594
354 | 2140 | 3926 | 5712 | 784CIP2B_27 | 619
355 | 2141 | 3927 | 5713 | 784CIP2B_28 | 620
356 | 2142 | 3928 | 5714 | 784CIP2B_29 | 654
357 | 2143 | 3929 | 5715 | 784CIP2B_30 | 692
358 | 2144 | 3930 | 5716 | 784CIP2B_31 | 753
359 | 2145 | 3931 | 5717 | 784CIP2B_32 | 758
360 | 2146 | 3932 | 5718 | 784CIP2B_33 | 787
361 | 2147 | 3933 | 5719 | 784CIP2B_34 | 833
362 | 2148 | 3934 | 5720 | 784CIP2B_35 | 838
363 | 2149 | 3935 | 5721 | 784CIP2B_36 | 870
364 | 2150 | 3936 | 5722 | 784CIP2B_37 | 891
365 | 2151 | 3937 | 5723 | 784CIP2B_38 | 891
366 | 2152 | 3938 | 5724 | 784CIP2B_39 | 921
367 | 2153 | 3939 | 5725 | 784CIP2B_40 | 924
368 | 2154 | 3940 | 5726 | 784CIP2B_41 | 932
369 | 2155 | 3941 | 5727 | 784CIP2B_42 | 942


370 | 2156 | 3942 | 5728 | 784CIP2B_43 | 958
371 | 2157 | 3943 | 5729 | 784CIP2B_44 | 968
372 | 2158 | 3944 | 5730 | 784CIP2B_45 | 992
373 | 2159 | 3945 | 5731 | 784CIP2B_46 | 1025
374 | 2160 | 3946 | 5732 | 784CIP2B_47 | 1074
375 | 2161 | 3947 | 5733 | 784CIP2B_48 | 1104
376 | 2162 | 3948 | 5734 | 784CIP2B_49 | 1114
377 | 2163 | 3949 | 5735 | 784CIP2B_50 | 1144
378 | 2164 | 3950 | 5736 | 784CIP2B_51 | 1262
379 | 2165 | 3951 | 5737 | 784CIP2B_52 | 1318
380 | 2166 | 3952 | 5738 | 784CIP2B_53 | 1319
381 | 2167 | 3953 | 5739 | 784CIP2B_54 | 1328
382 | 2168 | 3954 | 5740 | 784CIP2B_55 | 1436
383 | 2169 | 3955 | 5741 | 784CIP2B_56 | 1464
384 | 2170 | 3956 | 5742 | 784CIP2B_57 | 1584
385 | 2171 | 3957 | 5743 | 784CIP2B_58 | 1617
386 | 2172 | 3958 | 5744 | 784CIP2B_59 | 1724
387 | 2173 | 3959 | 5745 | 784CIP2B_60 | 1728
388 | 2174 | 3960 | 5746 | 784CIP2B_61 | 1772
389 | 2175 | 3961 | 5747 | 784CIP2B_62 | 1809
390 | 2176 | 3962 | 5748 | 784CIP2B_63 | 1868
391 | 2177 | 3963 | 5749 | 784CIP2B_64 | 1898
392 | 2178 | 3964 | 5750 | 784CIP2B_65 | 1926
393 | 2179 | 3965 | 5751 | 784CIP2B_66 | 19?5
394 | 2180 | 3966 | 5752 | 784CIP2B_67 | 1967
395 | 2181 | 3967 | 5753 | 784CIP2B_68 | 1995
396 | 2182 | 3968 | 5754 | 784CIP2B_69 | 2005
397 | 2183 | 3969 | 5755 | 784CIP2B_70 | 2027
398 | 2184 | 3970 | 5756 | 784CIP2B_71 | 2055
399 | 2185 | 3971 | 5757 | 784CIP2B_72 | 2103
400 | 2186 | 3972 | 5758 | 784CIP2B_73 | 2106
401 | 2187 | 3973 | 5759 | 784CIP2B_74 | 2166
402 | 2188 | 3974 | 5760 | 784CIP2B_75 | 2175
403 | 2189 | 3975 | 5761 | 784CIP2B_76 | 2176
404 | 2190 | 3976 | 5762 | 784CIP2B_78 | 2236
405 | 2191 | 3977 | 5763 | 784CIP2B_79 | 2250
406 | 2192 | 3978 | 5764 | 784CIP2B_80 | 2306
407 | 2193 | 3979 | 5765 | 784CIP2B_81 | 2323
408 | 2194 | 3980 | 5766 | 784CIP2B_82 | 2340
409 | 2195 | 3981 | 5767 | 784CIP2B_83 | 2371
410 | 2196 | 3982 | 5768 | 784CIP2B_84 | 2399
411 | 2197 | 3983 | 5769 | 784CIP2B_85 | 2411
412 | 2198 | 3984 | 5770 | 784CIP2B_86 | 2428
413 | 2199 | 3985 | 5771 | 784CIP2B_87 | 2430
414 | 2200 | 3986 | 5772 | 784CIP2B_88 | 2439
415 | 2201 | 3987 | 5773 | 784CIP2B_89 | 2447
416 | 2202 | 3988 | 5774 | 784CIP2B_90 | 2461
417 | 2203 | 3989 | 5775 | 784CIP2B_91 | 2487
418 | 2204 | 3990 | 5776 | 784CIP2B_92 | 2492
419 | 2205 | 3991 | 5777 | 784CIP2B_93 | 2512
420 | 2206 | 3992 | 5778 | 784CIP2B_94 | 2564
421 | 2207 | 3993 | 5779 | 784CIP2B_95 | 2678
422 | 2208 | 3994 | 5780 | 784CIP2B_96 | 2816
423 | 2209 | 3995 | 5781 | 784CIP2B_97 | 2818
424 | 2210 | 3996 | 5782 | 784CIP2B_98 | 2819
425 | 2211 | 3997 | 5783 | 784CIP2B_99 | 2943
426 | 2212 | 3998 | 5784 | 784CIP2B_100 | 3137
427 | 2213 | 3999 | 5785 | 784CIP2B_101 | 3137
428 | 2214 | 4000 | 5786 | 784CIP2B_102 | 3160
429 | 2215 | 4001 | 5787 | 784CIP2B_103 | 3323
430 | 2216 | 4002 | 5788 | 784CIP2B_104 | 3360
431 | 2217 | 4003 | 5789 | 784CIP2B_105 | 3362


432 


2218 


4004 


5790 


784CIP2B 106 


3417 


433 


2219 


4005 


5791 


784CIP2B 107 


3418 


434 


2220 


4006 


5792 


784CIP2B_108 


3442 


435 


2221 


4007 


5793 


784CIP2B 109 


3442 


436 


2222 


4008 


5794 


784CIP2B 110 


3444 


437 


2223 


4009 


5795 


784CIP2B 111 


3855 


438 


2224 


4010 


5796 


784CIP2B 112 


3863 


439 


2225 


4011 


5797 


784CIP2B 113 


4090 


440 


2226 


4012 


5798 


784CIP2B_114 


4105 


441 


2227


4013 


5799 


784CIP2B_115 


4142 


442 


2228 


4014 


5800 


784CIP2B_116


4142 


443 


2229 


4015 


5801


784CIP2B_117 


4149 


444 


2230 


4016 


5802 


784CIP2B_118


4196 


445 


2231 


4017 


5803 


784CIP2B_119 


4202 


446 


2232 


4018 


5804 


784CIP2B 120 


4274 


447 


2233 


4019 


5805 


784CIP2B_121


4304 


448


2234 


4020 


5806 


784CIP2B 122 


4306 


449 


2235 


4021 


5807


784CIP2B 123 


4311 


450 


2236 


4022 


5808 


784CIP2B 124 


4321 


451 


2237 


4023 


5809 


784CIP2B_125 


4323 


452 


2238 


4024 


5810 


784CIP2B 126 


4332 


453


2239 


4025 


5811 


784CIP2B 127 


4488 


454 


2240 


4026 


5812 


784CIP2B_128 


4588 


455


2241 


4027 


5813 


784CIP2B_129


5569 


456 


2242 


4028 


5814 


784CIP2B 130 


5573 


457 


2243 


4029 


5815 


784CIP2B_131


5577 


458 


2244 


4030 


5816 


784CIP2B_132


5579 


459 


2245 


4031 


5817 


784CIP2B_133 


5582 


460 


2246 


4032 


5818 


784CIP2B_134


5583 


461 


2247 


4033 


5819 


784CIP2B_135 


5584 


462 


2248 


4034 


5820 


784CIP2B 136 


5585 


463 


2249 


4035 


5821 


784CIP2B_137 


5591 


464 


2250 


4036 


5822 


784CIP2B_138 


5593 


465 


2251 


4037 


5823


784CIP2B 139 


5594 


466 


2252 


4038 


5824 


784CIP2B_140 


5594 


467 


2253 


4039 


5825 


784CIP2B_141


5598 


468 


2254 


4040 


5826


784CIP2B_142 


5602 


469 


2255 


4041 


5827 


784CIP2B_143 


5605 


470 


2256 


4042 


5828 


784CIP2B_144


5608 


471 


2257 


4043 


5829 


784CIP2B_145 


5617 


472 


2258 


4044 


5830 


784CIP2B_146 


5620 


473 


2259 


4045 


5831 


784CIP2B_147


5622 


474 


2260 


4046 


5832 


784CIP2B_148


5623 


475 


2261 


4047 


5833 


784CIP2B_149 


5624


476 


2262 


4048 


5834 


784CIP2B_150


5625 


477 


2263 


4049 


5835 


784CIP2B_151 


5627 


478 


2264 


4050 


5836 


784CIP2B_152 


5628 


479 


2265


4051 


5837 


784CIP2B_153


5630 


480 


2266 


4052 


5838 


784CIP2B_154 


5632 


481 


2267 


4053 


5839 


784CIP2B 155 


5640 


482 


2268 


4054 


5840 


784CIP2B_156


5641 


483


2269 


4055


5841 


784CIP2B 157 


5643 


484 


2270 


4056 


5842 


784CIP2B_158 


5647


485 


2271 


4057 


5843 


784CIP2B_159


5649 


486 


2272 


4058 


5844 


784CIP2B_160 


5658


487 


2273 


4059 


5845 


784CIP2B 161 


5659 


488 


2274 


4060 


5846 


784CIP2B_162 


5667


489 


2275 


4061 


5847 


784CIP2B_163 


5672


490 


2276 


4062 


5848 


784CIP2B_164


5674 


491


2277 


4063 


5849 


784CIP2B 165 


5678 


492 


2278 


4064 


5850


784CIP2B_166 


5680 


493 


2279 


4065 


5851 


784CIP2B_167 


5664


494


2280 


4066 


5852 


784CIP2B 168 


5686


495 


2281 


4067 


5853 


784CIP2B_169 


5694 


496 


2282 


4068 


5854 


784CIP2B_170


5698


497 


2283 


4069 


5855 


784CIP2B 171 


5699


498 


2284 


4070


5856 


784CIP2B_172 


5712


499 


2285 


4071 


5857 


784CIP2B 173 


5719 


500 


2286 


4072 


5858 


784CIP2B 174 


5720 


501 


2287 


4073 


5859 


784CIP2B 175 


5727 


502 


2288 


4074 


5860 


784CIP2B_176 


5730 


503 


2289 


4075 


5861 


784CIP2B 177 


5734 


504 


2290 


4076


5862 


784CIP2B 178 


5738 


505 


2291 


4077 


5863 


784CIP2B_179


5739 


506 


2292 


4078 


5864 


784CIP2B 180 


5740 


507 


2293 


4079


5865 


784CIP2B 181 


5744 


508 


2294 


4080 


5866 


784CIP2B 182 


5748 


509


2295 


4081 


5867 


784CIP2B 183 


5749 


510 


2296


4082 


5868 


784CIP2B 184 


5750 


511 


2297 


4083 


5869 


784CIP2B_185 


5750 


512 


2298 


4084


5870 


784CIP2B_186


5750 


513 


2299 


4085 


5871 


784CIP2B 187 


5761 


514 


2300 


4086 


5872 


784CIP2B_188


5762 


515 


2301 


4087 


5873 


784CIP2B 189 


5767 


516 


2302


4088 


5874 


784CIP2B_190


5773 


517 


2303 


4089 


5875 


784CIP2B_191 


5783


518


2304 


4090 


5876 


784CIP2B_192


5784


519 


2305 


4091 


5877 


784CIP2B_193


5788 


520 


2306 


4092 


5878 


784CIP2B_194


5798 


521 


2307 


4093 


5879 


784CIP2B_196 


5807 


522 


2308 


4094 


5880 


784CIP2B_197 


5818 


523 


2309


4095


5881 


784CIP2B 198 


5819 


524 


2310 


4096 


5882 


784CIP2B_199 


5827 


525 


2311 


4097 


5883


784CIP2B_200 


5828 


526 


2312 


4098 


5884 


784CIP2B_201 




527 


2313 


4099 


5885 


784CIP2B_202 


5853 


528


2314 


4100 


5886 


784CIP2B_203 


5861 


529 


2315 


4101 


5887 


784CIP2B_204 


5864 


530 


2316 


4102 


5888 


784CIP2B_205


5865 


531


2317 


4103 


5889 


784CIP2B 206 


5871


532 


2318 


4104 


5890 


784CIP2B_207


5873 


533 


2319 


4105 


5891 


784CIP2B 208 


5873 


534 


2320 


4106 


5892 


784CIP2B_209


5875 


535 


2321 


4107 


5893 


784CIP2B_210


5878 


536 


2322 


4108 


5894 


784CIP2B 211 


5879 


537 


2323 


4109 


5895 


784CIP2B_212 


5880 


538


2324 


4110 


5896 


784CIP2B 213 


5880 


539 


2325 


4111 


5897 


784CIP2B 214 


5880 


540 


2326 


4112 


5898 


784CIP2B_215


5880 


541 


2327 


4113 


5899 


784CIP2B_216 


5885 


542 


2328 


4114 


5900 


784CIP2B_217 


5895 


543 


2329 


4115 


5901 


784CIP2B_218


5898


544 


2330 


4116 


5902 


784CIP2B_219 


5902 


545 


2331 


4117 


5903 


784CIP2B_220


5904 


546 


2332 


4118 


5904 


784CIP2B_221


5918


547


2333


4119 


5905 


784CIP2B 222 


5921 


548 


2334 


4120 


5906 


784CIP2B 223 


5927 


549 


2335 


4121 


5907 


784CIP2B 224 


5932 


550 


2336 


4122 


5908 


784CIP2B_225 


5939 


551 


2337 


4123 


5909 


784CIP2B 226 


5945 


552 


2338


4124 


5910 


784CIP2B_227


5946 


553 


2339 


4125 


5911 


784CIP2B_228 


5947 


554 


2340 


4126 


5912 


784CIP2B 229 


5956 


555 


2341 


4127 


5913 


784CIP2B 230 


5967


556 


2342 


4128


5914 


784CIP2B_232 


5975 


557 


2343 


4129


5915 


784CIP2B 233 


5977 


558 


2344 


4130 


5916 


784CIP2B 234 


5978 


559 


2345 


4131 


5917 


784CIP2B_235 


5979 


560 


2346 


4132 


5918 


784CIP2B 236 


5980 


561 


2347 


4133 


5919 


784CIP2B_237 


5988 


562


2348


4134 


5920 


784CIP2B_238


5989 


563 


2349 


4135 


5921 


784CIP2B_239 


5991 


564


2350


4136 


5922 


784CIP2B 240 


5997 


565


2351 


4137 


5923 


784CIP2B_241


5998


566 


2352 


4138 


5924 


784CIP2B_242 


6003 


567 


2353 


4139 


5925 


784CIP2B 243 


6004 


568


2354 


4140 


5926


784CIP2B 244 


6013 


569 


2355 


4141 


5927 


784CIP2B 245 


6028 


570 


2356 


4142 


5928 


784CIP2B_246 


6028 


571 


2357 


4143 


5929 


784CIP2B 247 


6029 


572 


2358 


4144 


5930 


784CIP2B 248 


6031 


573 


2359 


4145 


5931 


784CIP2B 249 


6031


574 


2360


4146 


5932 


784CIP2B 250 


6032 


575 


2361 


4147 


5933 


784CIP2B 251 


6037 


576 


2362 


4148 


5934 


784CIP2B 252 


6037 


577 


2363 


4149 


5935


784CIP2B 253 


6043


578 


2364 


4150 


5936 


784CIP2B_254


6044 


579


2365 


4151 


5937 


784CIP2B 255 


6046 


580


2366 


4152 


5938 


784CIP2B 256 


6048 


581 


2367 


4153 


5939 


784CIP2B 257 


6049 


582 


2368 


4154 


5940


784CIP2B_258


6051


583 


2369 


4155 


5941 


784CIP2B_259


6053 


584 


2370 


4156 


5942 


784CIP2B 260 


6060 


585 


2371 


4157 


5943 


784CIP2B 261 


6063 


586


2372 


4158


5944 


784CIP2B_262


6066 


587 


2373 


4159


5945 


784CIP2B 263 


6067 


588


2374 


4160 


5946 


784CIP2B 264 


6068 


589 


2375 


4161 


5947 


784CIP2B 265 


6073 


590 


2376 


4162 


5948


784CIP2B_266


6076 


591 


2377 


4163 


5949 


784CIP2B 267 


6076 


592 


2378 


4164 


5950 


784CIP2B 268 


6077


593 


2379 


4165 


5951 


784CIP2B_269


6079


594 


2380 


4166 


5952 


784CIP2B 270 


6082


595 


2381


4167


5953 


784CIP2B 272 


6088 


596


2382 


4168


5954 


784CIP2B 273 


6091 


597 


2383


4169 


5955 


784CIP2B 274 


6094 


598 


2384 


4170 


5956 


784CIP2B 275 


6101 


599 


2385 


4171 


5957


784CIP2B_276


6103 


600


2386 


4172 


5958 


784CIP2B 277 


6104 


601 


2387 


4173 


5959 


784CIP2B 278 


6108 


602 


2388


4174 


5960 


784CIP2B 279 


6112 


603 


2389


4175 


5961


784CIP2B_280 


6121 


604


2390 


4176 


5962 


784CIP2B_281 


6125 


605 


2391 


4177 


5963 


784CIP2B 282 


6126 


606 


2392 


4178 


5964 


784CIP2B 283 


6128 


607 


2393 


4179 


5965 


784CIP2B_284


6129


608


2394 


4180 


5966 


784CIP2B 285 


6133 


609 


2395 


4181 


5967 


784CIP2B_286


6133 


610 


2396 


4182 


5968 


784CIP2B_287


6135 


611 


2397 


4183 


5969 


784CIP2B 288 


6139


612 


2398 


4184 


5970 


784CIP2B 289 


6141 


613 


2399 


4185 


5971 


784CIP2B 290 


6145 


614 


2400 


4186 


5972 


784CIP2B_291


6146 


615 


2401 


4187 


5973 


784CIP2B_292 


6148 


616 


2402


4188 


5974 


784CIP2B_293


6149 


617 


2403


4189 


5975 


784CIP2B_294


6149


618 


2404 


4190 


5976 


784CIP2B 295 


6153


619 


2405 


4191 


5977 


784CIP2B_296 


6159 


620 


2406 


4192 


5978 


784CIP2B_297 


6164 


621 


2407 


4193 


5979 


784CIP2B 298 


6167 


622 


2408 


4194 


5980 


784CIP2B_299 


6172 


623 


2409 


4195 


5981 


784CIP2B 300 


6173 


624 


2410


4196 


5982 


784CIP2B_301 


6190 


625 


2411 


4197 


5983 


784CIP2B 302 


6194 


626 


2412 


4198 


5984 


784CIP2B_303 


6196 


627 


2413 


4199 


5985 


784CIP2B_304 


6197 


628 


2414 


4200 


5986 


784CIP2B 305 


6198


629 


2415 


4201 


5987 


784CIP2B_306


6198


630 


2416 


4202 


5988 


784CIP2B 308 


6214 


631 


2417 


4203 


5989 


784CIP2B_309


6215 


632 


2418 


4204 


5990 


784CIP2B 310 


6219 


633 


2419 


4205 


5991 


784CIP2B 311 


6226 


634


2420 


4206 


5992 


784CIP2B 312 


6229 


635 


2421 


4207 


5993 


784CIP2B_313 


6234 


636 


2422 


4208 


5994 


784CIP2B_314


6237 


637 


2423 


4209 


5995 


784CIP2B 315 


6238 


638 


2424 


4210 


5996 


784CIP2B 316 


6239 


639 


2425 


4211 


5997 


784CIP2B 317 


6239 


640 


2426 


4212 


5998 


784CIP2B_318 


6239 


641 


2427 


4213 


5999 


784CIP2B_319 


6240 


642 


2428 


4214 


6000 


784CIP2B 320 


6244 


643 


2429 


4215 


6001 


784CIP2B_321 


6245 


644 


2430 


4216 


6002 


784CIP2B_322 


6250


645 


2431 


4217 


6003 


784CIP2B_323 


6252 


646 


2432


4218 


6004 


784CIP2B 324 


6252 


647 


2433 


4219 


6005 


784CIP2B_325 


6256 


648 


2434 


4220 


6006 


784CIP2B_326 


6260 


649 


2435 


4221 


6007 


784CIP2B 327 


6261


650 


2436


4222


6008


784CIP2B_328


6264


651 


2437 


4223 


6009 


784CIP2B_329 


6265 


652 


2438 


4224 


6010 


784CIP2B 330 


6266 


653 


2439 


4225 


6011 


784CIP2B 331 


6270 


654 


2440 


4226 


6012 


784CIP2B_332


6271


655 


2441 


4227 


6013 


784CIP2B_334


6274 


656 


2442 


4228 


6014 


784CIP2B 335 


6276 


657 


2443 


4229 


6015 


784CIP2B_336


6281 


658 


2444 


4230 


6016


784CIP2B_337


6281 


659 


2445 


4231 


6017 


784CIP2B_338


6288 


660 


2446 


4232 


6018 


784CIP2B_339 


6292 


661 


2447 


4233 


6019 


784CIP2B_340 


6294 


662 


2448 


4234 


6020 


784CIP2B_343


6312 


663 


2449 


4235 


6021 


784CIP2B 344 


6312 


664 


2450 


4236 


6022 


784CIP2B 345 


6312 


665 


2451


4237 


6023 


784CIP2B 346 


6322 


666


2452 


4238 


6024 


784CIP2B 347 


6324 


667 


2453 


4239 


6025 


784CIP2B 349 


6329


668


2454 


4240 


6026 


784CIP2B 350 


6331 


669 


2455


4241 


6027 


784CIP2B 351 


6333 


670 


2456 


4242 


6028 


784CIP2B_352


6334 


671 


2457 


4243 


6029 


784CIP2B_353 


6337 


672 


2458 


4244 


6030 


784CIP2B_354 


6339 


673 


2459


4245 


6031 


784CIP2B_355 


6346 


674 


2460 


4246 


6032 


784CIP2B_356


6348 


675 


2461 


4247 


6033 


784CIP2B_357


6348 


676 


2462 


4248 


6034 


784CIP2B 358 


6350 


677 


2463 


4249 


6035 


784CIP2B 359 


6351 


678 


2464 


4250 


6036 


784CIP2B 360 


6355 


679


2465 


4251


6037 


784CIP2B_361


6362


680 


2466 


4252 


6038 


784CIP2B 362 


6368 


681 


2467 


4253 


6039 


784CIP2B 363 


6369 


682 


2468 


4254 


6040 


784CIP2B_364


6371 


683 


2469 


4255 


6041 


784CIP2B_365


6376 


684 


2470 


4256


6042 


784CIP2B 366 


6379 


685 


2471 


4257 


6043 


784CIP2B 367 


6380


686 


2472 


4258 


6044 


784CIP2B 368 


6381 


687 


2473


4259 


6045 


784CIP2B 369 


6392 


688


2474 


4260 


6046 


784CIP2B_370 


6395 


689 


2475 


4261 


6047 


784CIP2B 371 


6397 


690 


2476 


4262 


6048 


784CIP2B_372 


6400 


691 


2477 


4263 


6049 


784CIP2B 373 


6401 


692 


2478 


4264 


6050 


784CIP2B 374 


6411 


693 


2479 


4265 


6051 


784CIP2B 375 


6411 


694 


2480 


4266 


6052 


784CIP2B_376 


6411 


695 


2481 


4267


6053 


784CIP2B 377 


6416 


696 


2482 


4268 


6054 


784CIP2B 378 


6418 


697 


2483 


4269 


6055 


784CIP2B_379


6422 


698 


2484 


4270 


6056


784CIP2B 380 


6423 


699 


2485 


4271 


6057


784CIP2B_381


6426 


700 


2486 


4272 


6058 


784CIP2B 382 


6427 


701 


2487 


4273 


6059 


784CIP2B_383 


6428 


702 


2488


4274


6060 


784CIP2B_384 


6429 


703 


2489 


4275 


6061 


784CIP2B_385


6430 


704 


2490 


4276 


6062 


784CIP2B_386


6432 


705 


2491 


4277 


6063 


784CIP2B 387 


6432 


706 


2492 


4278 


6064 


784CIP2B 388 


6438 


707 


2493 


4279 


6065 


784CIP2B 389 


6441 


708 


2494 


4280 


6066


784CIP2B_390 


6446 


709 


2495 


4281 


6067 


784CIP2B 391 


6454 


710 


2496


4282 


6068 


784CIP2B 392 


6459 


711 


2497 


4283 


6069 


784CIP2B 394 


6461 


712 


2498 


4284 


6070


784CIP2B_395


6467


713 


2499 


4285


6071


784CIP2B 396 


6468 


714 


2500 


4286


6072 


784CIP2B_397 


6487 


715 


2501 


4287 


6073 


784CIP2B_398 


6491 


716 


2502 


4288 


6074 


784CIP2B 399 


6506


717 


2503 


4289 


6075


784CIP2B_401


6514 


718 


2504 


4290 


6076 


784CIP2B 402 


6519 


719 


2505 


4291 


6077


784CIP2B_403 


6521 


720 


2506 


4292 


6078 


784CIP2B_404


6532 


721 


2507 


4293 


6079


784CIP2B 405 


6536


722 


2508 


4294


6080 


784CIP2B 406 


6543


723 


2509 


4295 


6081 


784CIP2B_407 


6544 


724 


2510 


4296 


6082 


784CIP2B_408


6548 


725 


2511 


4297


6083


784CIP2B_409


6551 


726 


2512 


4298 


6084 


784CIP2B 410 


6551 


727 


2513 


4299 


6085 


784CIP2B_411 


6552 


728 


2514 


4300 


6086 


784CIP2B 412 


6554 


729 


2515 


4301 


6087 


784CIP2B_413


6556


730 


2516 


4302 


6088 


784CIP2B_414 


6560 


731 


2517 


4303 


6089 


784CIP2B_415 


6563 


732 


2518 


4304 


6090 


784CIP2B 416 


6564 


733 


2519 


4305 


6091 


784CIP2B_417 


6567


734 


2520 


4306 


6092


784CIP2B_418


6573 


735 


2521 


4307 


6093 


784CIP2B_419 


6575 


736 


2522 


4308 


6094 


784CIP2B_420 


6577 


737 


2523 


4309 


6095 


784CIP2B_421 


6593 


738 


2524 


4310 


6096 


784CIP2B 422 


6595 


739 


2525


4311 


6097 


784CIP2B_423 


6599 


740 


2526 


4312 


6098 


784CIP2B 424 


6625 


741 


2527 


4313 


6099 


784CIP2B 425 


6625


742 


2528 


4314


6100 


784CIP2B 426 


6626 


743 


2529 


4315 


6101 


784CIP2B_427 


6630


744 


2530 


4316 


6102 


784CIP2B_428 


6631 


745 


2531 


4317 


6103 


784CIP2B_429 


6632


746


2532 


4318 


6104 


784CIP2B_430 


6633 


747 


2533 


4319 


6105


784CIP2B_431 


6634 


748 


2534 


4320 


6106 


784CIP2B_432 


6638 


749 


2535 


4321 


6107 


784CIP2B 433 


6641


750 


2536 


4322 


6108 


784CIP2B 434 


6644 


751 


2537 


4323 


6109 


784CIP2B 435 


6646 


752 


2538 


4324 


6110 


784CIP2B_436 


6646 


753 


2539 


4325 


6111 


784CIP2B_437 


6652 


754 


2540 


4326 


6112 


784CIP2B_438 


6654 


755 


2541 


4327 


6113 


784CIP2B_439 


6657 


756 


2542 


4328 


6114 


784CIP2B 440 


6658 


757 


2543 


4329 


6115 


784CIP2B_441


6663 


758


2544 


4330 


6116 


784CIP2B 442 


6664 


759


2545 


4331 


6117 


784CIP2B_443 


6668 


760 


2546 


4332 


6118 


784CIP2B_444


6669 


761 


2547 


4333 


6119 


784CIP2B 445 


6673


762


2548 


4334 


6120 


784CIP2B 446 


6685 


763 


2549


4335 


6121 


784CIP2B 447 


6687


764 


2550 


4336 


6122 


784CIP2B 448 


6689


765 


2551 


4337 


6123 


784CIP2B_449 


6693 


766 


2552 


4338 


6124 


784CIP2B_450 


6698


767 


2553 


4339 


6125 


784CIP2B_451 


6699 


768


2554 


4340 


6126 


784CIP2B_452 


6705 


769 


2555 


4341 


6127 


784CIP2B_453 


6711 


770


2556 


4342 


6128 


784CIP2B_454 


6713 


771 


2557


4343


6129


784CIP2B_455 


6716 


772 


2558 


4344 


6130


784CIP2B 456 


6725 


773 


2559 


4345 


6131 


784CIP2B 457 


6726 


774 


2560 


4346 


6132 


784CIP2B_458


6727 


775 


2561 


4347 


6133


784CIP2B_459


6730


776 


2562


4348


6134 


784CIP2B 460 


6730 


777 


2563 


4349 


6135 


784CIP2B_461 


6730 


778 


2564 


4350 


6136 


784CIP2B 462 


6732 


779 


2565 


4351 


6137 


784CIP2B 463 


6733 


780 


2566 


4352


6138


784CIP2B_464


6737 


781 


2567 


4353 


6139 


784CIP2B 465 


6745 


782 


2568 


4354 


6140 


784CIP2B_466


6751


783 


2569 


4355 


6141 


784CIP2B_467


6754


784 


2570 


4356 


6142 


784CIP2B 468 


6758


785 


2571 


4357 


6143 


784CIP2B 469 


6761 


786 


2572 


4358 


6144 


784CIP2B 470 


6765 


787 


2573 


4359 


6145 


784CIP2B_471 


6768 


788 


2574 


4360 


6146 


784CIP2B_472 


6773 


789 


2575


4361 


6147 


784CIP2B 473 


6776 


790 


2576 


4362 


6148 


784CIP2B_474


6796 


791 


2577 


4363 


6149 


784CIP2B_475 


6798


792 


2578


4364 


6150 


784CIP2B 476 


6823 


793 


2579 


4365 


6151 


784CIP2B 477 


6825 


794


2580 


4366 


6152 


784CIP2B_478 


6826 


795 


2581 


4367 


6153 


784CIP2B_479


6839


796 


2582


4368 


6154 


784CIP2B_480


6844 


797 


2583 


4369


6155 


784CIP2B_482


6849 


798 


2584 


4370 


6156 


784CIP2B_483


6854


799 


2585 


4371 


6157 


784CIP2B_484 


6857 


800 


2586


4372 


6158


784CIP2B 485 


6861


801 


2587 


4373 


6159 


784CIP2B_486


6873


802 


2588 


4374 


6160 


784CIP2B 487 


6875


803 


2589 


4375 


6161


784CIP2B_488


6877


804 


2590 


4376 


6162 


784CIP2B_489 


6880 


805 


2591 


4377 


6163 


784CIP2B 490 


6885 


806 


2592 


4378 


6164 


784CIP2B 491 


6890 


807 


2593 


4379 


6165 


784CIP2B 492 


6890


808 


2594 


4380


6166


784CIP2B 493 


6894 


809 


2595 


4381 


6167 


784CIP2B 494 


6901 


810 


2596 


4382 


6168 


784CIP2B 495 


6904


811


2597 


4383 


6169 


784CIP2B 496 


6907 


812


2598 


4384 


6170 


784CIP2B_497


6914 


813 


2599 


4385 


6171 


784CIP2B_498 


6917 


814


2600 


4386


6172 


784CIP2B_499 


6923 


815 


2601 


4387 


6173 


784CIP2B_500


6929 


816 


2602 


4388 


6174 


784CIP2B 501 


6931


817 


2603 


4389 


6175 


784CIP2B_502


6935 


818 


2604 


4390 


6176 


784CIP2B_503


6940 


819 


2605 


4391 


6177 


784CIP2B_504


6945 


820 


2606 


4392 


6178 


784CIP2B_505


6946 


821 


2607 


4393 


6179 


784CIP2B_506 


6947


822 


2608 


4394 


6180 


784CIP2B 507 


6949 


823 


2609 


4395 


6181 


784CIP2B 508 


6959 


824 


2610 


4396


6182 


784CIP2B_509


6960 


825 


2611 


4397 


6183


784CIP2B 510 


6962 


826 


2612 


4398 


6184 


784CIP2B_511 


6963 


827 


2613 


4399 


6185 


784CIP2B_512 


6967 


828 


2614 


4400 


6186 


784CIP2B 513 


6983 


829 


2615 


4401 


6187


784CIP2B 514 


6988 


830


2616


4402 


6188


784CIP2B 515 


6996 


831 


2617


4403 


6189 


784CIP2B 516 


7003 


832 


2618 


4404 


6190 


784CIP2B 517 


7016 


833 


2619 


4405 


6191 


784CIP2B 518 


7017 


834 


2620 


4406 


6192


784CIP2B_519 


7025 


835 


2621 


4407 


6193 


784CIP2B 520 


7025 


836 


2622 


4408 


6194 


784CIP2B_521


7025 


837 


2623 


4409 


6195 


784CIP2B 522 


70^6 


838


2624 


4410 


6196


784CIP2B_523


7051 


839 


2625


4411 


6197 


784CIP2B 524 


7055 


840 


2626 


4412 


6198 


784CIP2B 525 


7060 


841 


2627 


4413 


6199 


784CIP2B 526 


7064 


842 


2628 


4414 


6200


784CIP2B_527


7067 


843 


2629 


4415 


6201 


784CIP2B_528


7071 


844 


2630


4416 


6202 


784CIP2B 529 


7072 


845 


2631 


4417 


6203 


784CIP2B 530 


7073 


846 


2632 


4418 


6204 


784CIP2B 531 


7074 


847 


2633 


4419 


6205 


784CIP2B 532 


7088 


848 


2634 


4420 


6206 


784CIP2B 533 


7089 


849 


2635 


4421 


6207 


784CIP2B 534 


7091 


850 


2636 


4422 


6208 


784CIP2B_535


7091 


851 


2637 


4423


6209


784CIP2B 536 


7104 


852 


2638


4424 


6210 


784CIP2B_537 


7105 


853 


2639 


4425 


6211


784CIP2B_538 


7105 


854 


2640


4426 


6212 


784CIP2B_539 


7109


855 


2641 


4427 


6213 


784CIP2B_540


7109


856 


2642 


4428 


6214 


784CIP2B_541 


7119 


857 


2643 


4429 


6215 


784CIP2B 542 


7120 


858


2644 


4430 


6216 


784CIP2B_543 


7121 


859 


2645 


4431


6217 


784CIP2B 544 


7126 


860 


2646


4432 


6218 


784CIP2B_545 


7127 


861


2647 


4433 


6219 


784CIP2B_546


7130 


862 


2648 


4434 


6220 


784CIP2B 547 


7131 


863 


2649 


4435


6221 


784CIP2B 548 


7144 


864 


2650 


4436 


6222 


784CIP2B 549 


7159 


865


2651


4437


6223 


784CIP2B 550 


7163


866 


2652 


4438 


6224 


784CIP2B_551 


7175 


867 


2653 


4439 


6225 


784CIP2B_552 


7188 


868 


2654 


4440 


6226 


784CIP2B_553 


7189 


869 


2655 


4441 


6227 


784CIP2B_554 


7190 


870 


2656 


4442 


6228 


784CIP2B 555 


7191


871 


2657 


4443 


6229 


784CIP2B_556


7203 


872 


2658 


4444 


6230 


784CIP2B_557 


7204 


873 


2659 


4445 


6231 


784CIP2B 558 


7208 


874 


2660 


4446 


6232 


784CIP2B_559


7209 


875 


2661


4447 


6233 


784CIP2B 560 


7210 


876 


2662 


4448 


6234 


784CIP2B 561 


7216 


877 


2663 


4449 


6235 


784CIP2B_562 


7221 


878 


2664 


4450 


6236 


784CIP2B_563 


7230 


879 


2665 


4451 


6237 


784CIP2B_564


7237 


880 


2666 


4452 


6238 


784CIP2B 565 


7240 


881


2667 


4453 


6239 


784CIP2B 566 


7245 


882 


2668 


4454 


6240 


784CIP2B_567 


7250 


883 


2669 


4455 


6241 


784CIP2B 568 


7251 


884 


2670 


4456


6242 


784CIP2B_569


7255 


885 


2671 


4457 


6243 


784CIP2B 570 


7260 


886 


2672 


4458 


6244 


784CIP2B 571 


7265 


887 


2673 


4459 


6245 


784CIP2B_572 


7268 


888 


2674 


4460 


6246 


784CIP2B 573 


7275 


889


2675


4461 


6247 


784CIP2B 574 


7279 


890 


2676 


4462 


6248 


784CIP2B_575


7283 


891 


2677 


4463 


6249 


784CIP2B_576


7283 


892 


2678 


4464 


6250 


784CIP2B_577


7287


893 


2679


4465 


6251 


784CIP2B 578 


7301 


894 


2680 


4466 


6252 


784CIP2B 579 


7308 


895 


2681 


4467 


6253 


784CIP2B_580


7308 


896 


2682 


4468 


6254 


784CIP2B_581 


7309 


897 


2683 


4469 


6255 


784CIP2B_582 


731$ 


898 


2684 


4470 


6256 


784CIP2B_583


7320 


899 


2685 


4471 


6257 


784CIP2B_584 


7326 


900 


2686 


4472


6258 


784CIP2B 585 


7326 


901 


2687 


4473 


6259 


784CIP2B_586


7334 


902 


2688 


4474 


6260 


784CIP2B_587 


7337 


903 


2689 


4475 


6261 


784CIP2B_588


7339


904 


2690 


4476 


6262 


784CIP2B_589 


7344 


905 


2691


4477 


6263


784CIP2B 590 


7355 


906


2692 


4478 


6264


784CIP2B 591 


7363


907 


2693 


4479 


6265 


784CIP2B_592 


7363 


908 


2694


4480 


6266 


784CIP2B_593 


7365 


909 


2695 


4481 


6267 


784CIP2B 594 


7368 


910 


2696


4482 


6268 


784CIP2B_595


7369 


911 


2697 


4483


6269 


784CIP2B_596 


7372 


912 


2698 


4484 


6270 


784CIP2B_599 


7375 


913 


2699 


4485 


6271 


784CIP2B_600 


7381 


914 


2700 


4486 


6272 


784CIP2B_601


7383 


915 


2701 


4487 


6273 


784CIP2B 602 


7387 


916 


2702 


4488 


6274 


784CIP2B 603 


7391 


917 


2703 


4489 


6275 


784CIP2B_604 


7393 


918 


2704 


4490 


6276 


784CIP2B 605 


739$ 


919 


2705 


4491 


6277 


784CIP2B_606


7397 


920 


2706 


4492 


6278 


784CIP2B_607 


7399 


921 


2707 


4493 


6279 


784CIP2B_608 


7405 


922 


2708 


4494 


6280 


784CIP2B 609 


7406 


923 


2709 


4495 


6281 


784CIP2B_610


7406 


924 


2710 


4496 


6282 


784CIP2B 611 


7409 


925 


2711 


4497 


6283 


784CIP2B 612 


7410 


926 


2712 


4498


6284 


784CIP2B 613 


7411 


927


2713 


4499 


6285 


784CIP2B 614 


7417


928 


2714 


4500 


6286 


784CIP2B_615 


7418 


929 


2715


4501 


6287


784CIP2B_616


7421 


930 


2716 


4502 


6288 


784CIP2B 617 


7422 


931 


2717 


4503 


6289 


784CIP2B_618


7422 


932 


2718 


4504 


6290 


784CIP2B 619 


7423


933


2719 


4505 


6291 


784CIP2B_620


7424 


934 


2720 


4506 


6292 


784CIP2B 621 


7426 


935 


2721 


4507 


6293 


784CIP2B 622 


7427 


936 


2722 


4508 


6294 


784CIP2B_623


7428


937 


2723 


4509 


6295 


784CIP2B 624 


7430 


938 


2724 


4510 


6296 


784CIP2B_625


7435


939 


2725


4511 


6297 


784CIP2B 626 


7437 


940 


2726 


4512 


6298 


784CIP2B_627


7439 


941 


2727 


4513 


6299 


784CIP2B_628


7440 


942 


2728 


4514 


6300 


784CIP2B_629


7442 


943 


2729 


4515 


6301 


784CIP2B 630 


7450


944 


2730 


4516 


6302 


784CIP2B_631


7451 


945


2731 


4517 


6303 


784CIP2B_632 


7452 


946 


2732 


4518 


6304 


784CIP2B_633


7454 


947 


2733 


4519 


6305 


784CIP2B_634 


7457 


948 


2734 


4520 


6306 


784CIP2B_635 


7459 


949


2735 


4521 


6307 


784CIP2B_636 


7461 


950 


2736


4522 


6308 


784CIP2B 637 


7463 


951 


2737 


4523 


6309 


784CIP2B 638 


"74"** 


952


2738 


4524 


6310 


784CIP2B 639 


7469 


953 


2739 


4525 


6311 


784CIP2B 640 


7473 


954 


2740 


4526 


6312 


784CIP2B_641 


7481 


955


2741 


4527 


6313 


784CIP2B 642 


7482 


956 


2742 


4528 


6314 


784CIP2B_643


7482 


957 


2743 


4529 


6315 


784CIP2B_644 


7483 


958 


2744 


4530 


6316 


784CIP2B_645 


7485


959 


2745 


4531 


6317 


784CIP2B_646 




960


2746 


4532 


6318


784CIP2B_647


7487 


961 


2747 


4533 


6319 


784CIP2B_648 


7491 


962


2748 


4534 


6320 


784CIP2B_649


7492 


963


2749 


4535 


6321 


784CIP2B_650


7494 


964


2750 


4536 


6322


784CIP2B_651


7498


965 


2751 


4537 


6323 


784CIP2B 652 


7504 


966 


2752 


4538 


6324 


784CIP2B_653


7508 


967 


2753 


4539 


6325 


784CIP2B_654


7516


968


2754


4540


6326 


784CIP2B_655


7518


969 


2755 


4541 


6327 


784CIP2B_656


7519 


970 


2756 


4542 


6328 


784CIP2B_657


7521 


971 


2757 


4543 


6329


784CIP2B_658


7529 


972 


2758 


4544 


6330 


784CIP2B 659 


7532 


973 


2759 


4545 


6331


784CIP2B_660


7533 


974 


2760 


4546 


6332 


784CIP2B 661 


7535 


975 


2761 


4547 


6333 


784CIP2B 662 


7545 


976 


2762 


4548 


6334 


784CIP2B 663 


7546 


977 


2763


4549 


6335 


784CIP2B_664 


7552 


978 


2764 


4550 


6336 


784CIP2B 665 


7554 


979 


2765 


4551 


6337 


784CIP2B 666 


7567 


980 


2766 


4552 


6338


784CIP2B_667


7569 


981


2767 


4553 


6339 


784CIP2B_668 


7575 


982


2768 


4554


6340 


784CIP2B_669


7576 


983 


2769 


4555 


6341 


784CIP2B_670


7577 


984 


2770 


4556


6342


784CIP2B_671


7579


985 


2771 


4557


6343 


784CIP2B_672


7582 


986 


2772 


4558 


6344 


784CIP2B 673 


7587 


987 


2773 


4559 


6345 


784CIP2B_674


7589 


988 


2774 


4560 


6346


784CIP2B_675


7597 


989 


2775 


4561 


6347 


784CIP2B_676


7597


990 


2776 


4562 


6348 


784CIP2B 677 


7609 


991 


2777 


4563 


6349 


784CIP2B 678 


7609 


992 


2778 


4564 


6350


784CIP2B_679


7609 


993 


2779 


4565 


6351


784CIP2B 680 


7613 


994 


2780 


4566 


6352 


784CIP2B_681


7623 


995


2781 


4567 


6353 


784CIP2B_682


7629 


996 


2782 


4568 


6354 


784CIP2B 683 


7630 


997 


2783 


4569 


6355 


784CIP2B_684 


7633 


998 


2784 


4570


6356 


784CIP2B_685 


7635 


999 


2785 


4571 


6357 


784CIP2B 686 


7638 


1000 


2786 


4572 


6358 


784CIP2B_687 


7639 


1001 


2787 


4573


6359 


784CIP2B 688 


7646 


1002 


2788 


4574


6360 


784CIP2B_689 


7647 


1003 


2789 


4575 


6361 


784CIP2B 690 


7648 


1004 


2790 


4576 


6362 


784CIP2B 691 


7658


1005 


2791 


4577 


6363 


784CIP2B 692 


7664 


1006 


2792 


4578


6364 


784CIP2B 693 


7664 


1007 


2793 


4579 


6365 


784CIP2B 695 


7674


1008 


2794 


4580 


6366 


784CIP2B 696 


767* "■ 


1009 


2795 


4581 


6367 


784CIP2B_697 


7676 


1010 


2796 


4582


6368 


784CIP2B 698 


7681


1011 


2797 


4583 


6369 


784CIP2B 699 


7688 


1012 


2798 


4584 


6370 


784CIP2B 700 


7693 


1013 


2799 


4585 


6371 


784CIP2B_701


7694 


1014 


2800 


4586


6372 


784CIP2B_702


7715 


1015 


2801 


4587 


6373 


784CIP2B 703 


7716 


1016 


2802 


4588 


6374 


784CIP2B 704 


7718 


1017 


2803 


4589 


6375 


784CIP2B_705


7721 


1018 


2804 


4590 


6376 


784CIP2B_706


7723 


1019


2805


4591 


6377 


784CIP2B 707 


7729 


1020 


2806 


4592 


6378 


784CIP2B_708 


7733 


1021 


2807 


4593 


6379 


784CIP2B 709 


7735 


1022 


2808 


4594 


6380 


784CIP2B_710


7741 


1023 


2809 


4595 


6381 


784CIP2B 711 


7743 


1024 


2810 


4596


6382


784CIP2B 712 


7748 


1025 


2811 


4597 


6383 


784CIP2B 713 


7749 


1026 


2812 


4598 


6384 


784CIP2B_714


7750 


1027 


2813 


4599 


6385


784CIP2B 715 


7757 


1028


2814 


4600 


6386 


784CIP2B_716 


7759 


1029 


2815 


4601 


6387 


784CIP2B_717


7760


1030 


2816 


4602 


6388 


784CIP2B_718


776TT" 


1031 


2817 


4603 


6389


784CIP2B 719 


7764 


1032 


2818 


4604 


6390 


784CIP2B 720 


7765 


1033 


2819 


4605 


6391 


784CIP2B 721 


7766 


1034 


2820 


4606 


6392 


784CIP2B 722 


7767 


1035 


2821 


4607


6393 


784CIP2B 723 


7769 


1036 


2822 


4608 


6394 


784CIP2B 724 


7770 


1037 


2823 


4609 


6395 


784CIP2B_725


7774 


1038 


2824 


4610 


6396 


784CIP2B 726 


7779 


1039


2825


4611 


6397 


784CIP2B_727


7781 


1040


2826 


4612 


6398 


784CIP2B 728 


7782 


1041 


2827 


4613 


6399 


784CIP2B 729 


7783 


1042 


2828


4614 


6400 


784CIP2B_730 


7787 


1043 


2829 


4615 


6401 


784CIP2B 731 


7792 


1044 


2830 


4616 


6402 


784CIP2B_732


7795 


1045 


2831 


4617 


6403 


784CIP2B 733 


7801 


1046 


2832 


4618 


6404 


784CIP2B_734


7807 


1047 


2833 


4619 


6405 


784CIP2B 735 


7808


1048 


2834 


4620 


6406 


784CIP2B 736 


7819


1049 


2835 


4621 


6407 


784CIP2B 737 


7824 


1050 


2836 


4622 


6408 


784CIP2B_738


7826 


1051 


2837 


4623 


6409 


784CIP2B 739 


7829


1052 


2838 


4624 


6410 


784CIP2B_740 


7832


1053


2839 


4625 


6411 


784CIP2B 741 


7839 


1054 


2840 


4626


6412 


784CIP2B 743 


7847 


1055


2841 


4627 


6413 


784CIP2B 744 


7848


1056


2842


4628 


6414


784CIP2B_745


7853


1057


2843 


4629 


6415


784CIP2B 746 


7854 


1058 


2844 


4630 


6416 


784CIP2B 747 


7856 


1059 


2845 


4631 


6417 


784CIP2B 748 


7862 


1060 


2846 


4632 


6418 


784CIP2B_749 


7865 


1061 


2847 


4633 


6419 


784CIP2B 750 


7874 


1062


2848 


4634 


6420 


784CIP2B 751 


7877 


1063 


2849 


4635 


6421 


784CIP2B 752 


7880 


1064 


2850 


4636 


6422 


784CIP2B_753


7882 


1065 


2851 


4637 


6423 


784CIP2B_754


7884 


1066 


2852 


4638


6424


784CIP2B 755 


7886 


1067


2853 


4639 


6425 


784CIP2B 756 


7888 


1068 


2854 


4640


6426 


784CIP2B_757


7889 


1069 


2855 


4641 


6427 


784CIP2B 758 


7901 


1070 


2856 


4642 


6428 


784CIP2B 759 


7910 


1071 


2857 


4643 


6429 


784CIP2B 760 


7911


1072 


2858 


4644 


6430 


784CIP2B_761


7921 


1073 


2859 


4645 


6431


784CIP2B 762 


7923 


1074 


2860 


4646 


6432 


784CIP2B_763


7924 


1075 


2861 


4647 


6433 


784CIP2B_764


7925 


1076 


2862 


4648 


6434 


784CIP2B_765


7928


1077 


2863


4649 


6435 


784CIP2B_766


7929 


1078 


2864


4650 


6436 


784CIP2B 767 


7930


1079 


2865


4651 


6437 


784CIP2B_768 


7934 


1080 


2866 


4652 


6438 


784CIP2B_769


7938 


1081 


2867


4653


6439


784CIP2B 770 


7942 


1082 


2868 


4654 


6440


784CIP2B 771 


7945 


1083 


2869 


4655 


6441 


784CIP2B 772 


7946


1084


2870 


4656 


6442 


784CIP2B 773 


7948 


1085 


2871


4657 


6443 


784CIP2B_774


7951 


1086 


2872 


4658 


6444 


784CIP2B 775 


7952 


1087 


2873


4659 


6445 


784CIP2B 776 


7953


1088 


2874 


4660 


6446 


784CIP2B 777 


7954 


1089 


2875 


4661 


6447 


784CIP2B 778 


7957 


1090 


2876 


4662


6448 


784CIP2B 779 


7958 


1091 


2877 


4663 


6449 


784CIP2B 780 


7961 


1092 


2878


4664 


6450 


784CIP2B 781 


7965 


1093 


2879 


4665


6451 


784CIP2B_782


7966 


1094 


2880 


4666 


6452 


784CIP2B 783 


7979 


1095 


2881 


4667 


6453 


784CIP2B_784


7986 


1096 


2882 


4668 


6454 


784CIP2B 785 


7986 


1097 


2883 


4669 


6455 


784CIP2B 786 


7988 


1098 


2884 


4670 


6456 


784CIP2B 787 


7991 


1099 


2885 


4671 


6457


784CIP2B 788 


7992 


1100 


2886 


4672 


6458 


784CIP2B_789


7992 


1101 


2887 


4673 


6459 


784CIP2B 790 


7992 


1102 


2888 


4674 


6460 


784CIP2B_791


7992 


1103 


2889 


4675 


6461 


784CIP2B 792 


8003 


1104 


2890 


4676 


6462 


784CIP2B 793 


8014 


1105 


2891 


4677


6463


784CIP2B_794 


8015 


1106 


2892 


4678 


6464 


784CIP2B_795


8016 


1107 


2893 


4679 


6465


784CIP2B 796 


8017 


1108 


2894


4680 


6466 


784CIP2B 797 


8019 


1109 


2895 


4681 


6467 


784CIP2B_798 


8020 


1110 


2896 


4682


6468 


784CIP2B_799 


8022 


1111 


2897 


4683


6469


784CIP2B_800


8022 


1112 


2898


4684 


6470 


784CIP2B 801 


8028 


1113 


2899 


4685


6471 


784CIP2B 802 


8030


1114 


2900 


4686 


6472 


784CIP2B_803 


8038 


1115 


2901 


4687 


6473 


784CIP2B 804 


8042 


1116 


2902 


4688


6474 


784CIP2B_805 


8045 


1117 


2903 


4689


6475


784CIP2B_806


8045 


1118 


2904 


4690 


6476 


784CIP2B_807 


8046 


1119


2905 


4691 


6477 


784CIP2B_808 


8047 


1120 


2906 


4692 


6478 


784CIP2B_809


8051 


1121 


2907 


4693 


6479 


784CIP2B 810 


8059


1122 


2908 


4694 


6480 


784CIP2B 811 


8064 


1123 


2909 


4695 


6481 


784CIP2B 812 


8069 


1124 


2910 


4696 


6482 


784CIP2B_813


8074 


1125 


2911 


4697 


6483 


784CIP2B 814 


8077 


1126 


2912 


4698 


6484


784CIP2B 815 


8078 


1127 


2913 


4699 


6485 


784CIP2B_816 


8079 


1128 


2914 


4700 


6486 


784CIP2B 817 


8084 


1129 


2915 


4701 


6487 


784CIP2B_818


8088


1130 


2916


4702 


6488 


784CIP2B_819


8090 


1131 


2917 


4703 


6489 


784CIP2B 820 


8091 


1132 


2918 


4704 


6490 


784CIP2B_821 


8099 


1133


2919 


4705 


6491 


784CIP2B 822 


8099 


1134 


2920 


4706 


6492


784CIP2B 823 


8100 


1135 


2921 


4707 


6493 


784CIP2B 824 


8102 


1136 


2922 


4708 


6494 


784CIP2B_825 


8103 


1137


2923 


4709 


6495


784CIP2B 826 


8103 


1138


2924 


4710 


6496


784CIP2B_827


8104 


1139 


2925 


4711 


6497 


784CIP2B_828


8108 


1140 


2926 


4712 


6498 


784CIP2B 829 


8110 


1141 


2927 


4713 


6499 


784CIP2B_830 


8116 


1142 


2928 


4714 


6500 


784CIP2B 831 


8117 


1143


2929 


4715 


6501


784CIP2B_832 


8123 


1144 


2930 


4716 


6502


784CIP2B 833 


8130 


1145 


2931 


4717


6503 


784CIP2B_834 


8130 


1146 


2932 


4718 


6504 


784CIP2B 835 


8143 


1147 


2933 


4719


6505 


784CIP2B_836


8143 


1148 


2934 


4720 


6506 


784CIP2B 837 


8154 


1149 


2935 


4721


6507 


784CIP2B 838 


8155 


1150 


2936 


4722


6508 


784CIP2B 839 


8162 


1151 


2937


4723 


6509 


784CIP2B_840


8163


1152 


2938 


4724 


6510 


784CIP2B 841 


8172 


1153 


2939 


4725 


6511 


784CIP2B 842 


8173


1154 


2940 


4726


6512 


784CIP2B 843 


8179 


1155 


2941 


4727


6513 


784CIP2B_844


8182


1156 


2942 


4728 


6514 


784CIP2B 845 


8183 


1157 


2943 


4729


6515 


784CIP2B 846 


8184


1158 


2944 


4730 


6516 


784CIP2B_847


8185 


1159 


2945 


4731 


6517 


784CIP2B 848 


8187 


1160 


2946


4732 


6518


784CIP2B 849 


8188 


1161 


2947 


4733 


6519 


784CIP2B 850 


8190 


1162 


2948 


4734 


6520 


784CIP2B 851 


8190 


1163 


2949 


4735 


6521 


784CIP2B_852


8192 


1164 


2950 


4736 


6522 


784CIP2B 853 


8193


1165 


2951 


4737 


6523 


784CIP2B_854 


8197 


1166 


2952 


4738 


6524 


784CIP2B 855 


8197 


1167 


2953 


4739


6525 


784CIP2B 856 


8199 


1168 


2954 


4740 


6526


784CIP2B_857


8202 


1169 


2955 


4741 


6527 


784CIP2B 858 


8203 


1170 


2956 


4742 


6528 


784CIP2B_859


8208 


1171 


2957 


4743 


6529


784CIP2B 860 


8209 


1172 


2958 


4744 


6530


784CIP2B 861 


8211 


1173 


2959 


4745 


6531 


784CIP2B_862


8214 


1174 


2960 


4746 


6532 


784CIP2B 863 


8217 


1175 


2961 


4747 


6533 


784CIP2B 864 


8223


1176 


2962


4748 


6534 


784CIP2B 865 


8224 


1177 


2963 


4749 


6535


784CIP2B_866


8226


1178 


2964 


4750 


6536 


784CIP2B 867 


8227 


1179 


2965 


4751


6537 


784CIP2B 868 


8229 


1180 


2966 


4752 


6538 


784CIP2B 869 


8232 


1181 


2967 


4753 


6539 


784CIP2B 870 


8236 


1182 


2968 


4754 


6540 


784CIP2B 871 


8239 


1183


2969 


4755 


6541 


784CIP2B 872 


8244 


1184 


2970 


4756 


6542 


784CIP2B 873 


8245 


1185


2971 


4757 


6543 


784CIP2B_874


8248


1186 


2972 


4758 


6544 


784CIP2B_875




1187 


2973 


4759 


6545 


784CIP2B_876


Oa3J 


1188 


2974 


4760 


6546 


784CIP2B_877


8260 


1189 


2975 


4761


6547 


784CIP2B_878


8262


1190 


2976 


4762


6548


784CIP2B_879


8268


1191 


2977 


4763


6549


784CIP2B_880


8270 


1192 


2978


4764


6550


784CIP2B_881


8272 


1193 


2979 


4765


6551


784CIP2B_882


8274 


1194


2980 


4766


6552


784CIP2B_883


8274 


1195 


2981


4767


6553


784CIP2B_884


8275 


1196 


2982 


4768 


6554 


784CIP2B_885


8277 


1197 


2983


4769


6555


784CIP2B_886


8281 


1198 


2984 


4770


6556


784CIP2B_887


8283 


1199 


2985 


4771 


6557


784CIP2B_888


8289


1200


2986


4772


6558


784CIP2B_889


8295 


1201 


2987 


4773


6559


784CIP2B_890


8300 


1202


2988 


4774


6560


784CIP2B_891


8303 


1203 


2989 


4775


6561


784CIP2B_892


8304 


1204 


2990 


4776


6562


784CIP2B_893


8305 


1205 


2991 


4777 


6563


784CIP2B_894


oJU j 


1206 


2992


4778 


6564 


784CIP2B_895




1207 


2993 


4779


6565


784CIP2B_896


8319


1208 


2994 


4780 


6566


784CIP2B_897


0 1 


1209 


2995 


4781 


6567


784CIP2B_898


"3*3 


1210 


2996 


4782 


6568 


784CIP2B_899


8323 


1211 


2997 


4783 


6569


784CIP2B_900


O J ZD 


1212 


2998 


4784 


6570 


784CIP2B_901




1213 


2999 


4785 


6571 


784CIP2B 902 


8332 


1214 


3000 


4786 


6572 


784CIP2B 903 


8333


1215 


3001 


4787 


6573 


784CIP2B_904


8335


1216 


3002 


4788


6574 


784CIP2B_905


8336 


1217 


3003 


4789


6575 


784CIP2B_906


8337 


1218 


3004 


4790


6576 


784CIP2B 907 


8340 


1219 


3005 


4791


6577 


784CIP2B 908 


8343 


1220


3006 


4792 


6578 


784CIP2B_909


8347


1221 


3007 


4793


6579


784CIP2B_910


8349 


1222 


3008 


4794


6580 


784CIP2B_911


8351 


1223 


3009 


4795 


6581 


784CIP2B 912 


8353 


1224 


3010 


4796


6582 


784CIP2B_913


8355 


1225 


3011 


4797 


6583


784CIP2B_914


OJDX 


1226 


3012 


4798 


6584


784CIP2B_915


0303 


1227 


3013 


4799 


6585


784CIP2B_916


tiAdn 


1228 


3014 


4800 


6586 


784CIP2B 917 


8369 


1229 


3015 


4801 


6587 


784CIP2B_919 


8375


1230 


3016 


4802 


6588 


784CIP2B_920 


8387 


1231 


3017 


4803 


6589 


784CIP2B_921 


8391 


1232


3018 


4804 


6590 


784CIP2B_922 


8393 


1233 


3019 


4805 


6591 


784CIP2B_923 


8393 


1234 


3020 


4806 


6592 


784CIP2B_924 


8394 


1235 


3021 


4807 


6593 


784CIP2B_925 


8395


1236 


3022 


4808 


6594 


784CIP2B 926 


8396


1237 


3023


4809 


6595 


784CIP2B 927 


8398


1238 


3024 


4810 


6596 


784CIP2B 928 


8402 


1239 


3025 


4811 


6597 


784CIP2B 929 


8402 


1240 


3026 


4812 


6598 


784CIP2B 930 


8405 


1241 


3027 


4813 


6599 


784CIP2B 931 


8406


1242 


3028 


4814 


6600 


784CIP2B 932 


8409


1243 


3029 


4815 


6601 


784CIP2B_933 


8410 


1244 


3030 


4816 


6602 


784CIP2B_934


8414


1245 


3031 


4817 


6603 


784CIP2B 935 


8415 


1246 


3032 


4818 


6604 


784CIP2B_936


8419 


1247 


3033 


4819 


6605 


784CIP2B 937 


8426 


1248 


3034 


4820 


6606 


784CIP2B 938 


8430


1249 


3035 


4821 


6607 


784CIP2B 939 


8431 


1250 


3036


4822


6608 


784CIP2B 940 


8432 


1251 


3037 


4823 


6609 


784CIP2B 941 


8433 


1252 


3038 


4824


6610 


784CIP2B 942 


8434 


1253 


3039


4825 


6611 


784CIP2B 943 


8438 


1254 


3040 


4826 


6612 


784CIP2B 944 


8439 


1255 


3041 


4827 


6613 


784CIP2B_945 


8441 


1256 


3042 


4828 


6614 


784CIP2B 946 


8450 


1257 


3043 


4829 


6615 


784CIP2B 947 


8451 


1258 


3044 


4830


6616 


784CIP2B 948 


8452 


1259 


3045 


4831 


6617 


784CIP2B 949 


8460 


1260


3046


4832 


6618 


784CIP2B 950 


8461 


1261 


3047 


4833


6619


784CIP2B_951


8462 


1262


3048 


4834 


6620 


784CIP2B_952


8464


1263 


3049 


4835 


6621 


784CIP2B 953 


8465


1264


3050


4836 


6622 


784CIP2B 954 


8467 


1265 


3051 


4837 


6623 


784CIP2B 955 


8470 


1266 


3052


4838


6624 


784CIP2B 956 


8471 


1267 


3053 


4839


6625 


784CIP2B 957 


8473 


1268 


3054


4840 


6626 


784CIP2B 958 


8474 


1269 


3055 


4841 


6627


784CIP2B 959 


847$ 


1270 


3056 


4842


6628 


784CIP2B_960 


8476 


1271 


3057 


4843 


6629 


784CIP2B 961 


8480 


1272 


3058 


4844


6630 


784CIP2B_962 


8482 


1273 


3059 


4845


6631 


784CIP2B 963 


8482 


1274 


3060 


4846 


6632 


784CIP2B 964 


8486 


1275 


3061 


4847 


6633 


784CIP2B 965 


8488 


1276 


3062 


4848 


6634 


784CIP2B 966 


8492 


1277 


3063 


4849 


6635 


784CIP2B_967


8494


1278 


3064 


4850 


6636


784CIP2B_968


8495


1279 


3065 


4851 


6637 


784CIP2B_969 


8497 


1280


3066 


4852 


6638 


784CIP2B_970 


8499 


1281 


3067 


4853 


6639 


784CIP2B 971 


8513 


1282 


3068 


4854 


6640 


784CIP2B 972 


8522 


1283 


3069 


4855


6641 


784CIP2B 973 


8526 


1284 


3070 


4856 


6642 


784CIP2B 974 


8531 


1285 


3071 


4857 


6643 


784CIP2B_975 


8533 


1286 


3072 


4858 


6644 


784CIP2B 976 


8542 


1287 


3073 


4859 


6645 


784CIP2B 977 


8544 


1288 


3074 


4860 


6646 


784CIP2B 978 


8565


1289 


3075 


4861 


6647 


784CIP2B 979 


8565 


1290 


3076 


4862 


6648 


784CIP2B 980 


8572


1291


3077


4863 


6649 


784CIP2B 981 


8576 


1292 


3078 


4864 


6650 


784CIP2B 982 


8578 


1293 


3079 


4865 


6651 


784CIP2B_983


8584 


1294 


3080 


4866 


6652 


784CIP2B 984 


8598


1295 


3081 


4867 


6653 


784CIP2B_985 


8602 


1296


3082 


4868 


6654 


784CIP2B_986 


8604


1297 


3083


4869


6655 


784CIP2B 987 


8609 


1298 


3084 


4870 


6656


784CIP2B 988 


8612 


1299 


3085


4871


6657 


784CIP2B_989


8637


1300 


3086 


4872 


6658 


784CIP2B_990


8640 


1301 


3087 


4873 


6659 


784CIP2B 991 


8643 


1302 


3088 


4874 


6660 


784CIP2B 992 


8645


1303 


3089 


4875 


6661 


784CIP2B 993 


8650 


1304


3090 


4876 


6662 


784CIP2B 994 


8651 


1305 


3091 


4877 


6663 


784CIP2B 995 


8654 


1306


3092 


4878 


6664 


784CIP2B_996


8655


1307 


3093 


4879 


6665 


784CIP2B 997 


8657 


1308 


3094 


4880 


6666 


784CIP2B_998


8665 


1309 


3095 


4881 


6667


784CIP2B 999 


8668 


1310 


3096 


4882 


6668 


784CIP2B 1000 


8671 


1311 


3097 


4883


6669


784CIP2B 1001 


8672 


1312 


3098 


4884 


6670 


784CIP2B 1002 


8692 


1313 


3099 


4885 


6671 


784CIP2B_1003


8706


1314 


3100 


4886 


6672 


784CIP2B_1004


8716 


1315 


3101 


4887 


6673 


784CIP2B_1005


8719


1316 


3102 


4888


6674 


784CIP2B 1006 


8743


1317 


3103 


4889 


6675 


784CIP2B 1007 


8764 


1318 


3104 


4890 


6676 


784CIP2B 1008 


8764 


1319 


3105 


4891 


6677


784CIP2B 1009 


8764


1320


3106 


4892 


6678


784CIP2B 1010 


8774 


1321 


3107 


4893 


6679 


784CIP2B 1011 


8782 


1322 


3108 


4894


6680 


784CIP2B 1012 


8796 


1323 


3109 


4895 


6681


784CIP2B 1013 


8827 


1324 


3110


4896


6682 


784CIP2B 1014 


8842 


1325 


3111 


4897 


6683 


784CIP2B_1015


8842 


1326 


3112 


4898 


6684 


784CIP2B 1016 


8858 


1327 


3113 


4899 


6685 


784CIP2B_1017


8871


1328 


3114 


4900


6686


784CIP2B 1018 


8921 


1329 


3115


4901


6687 


784CIP2B 1019 


8927 


1330 


3116 


4902 


6688 


784CIP2B_1020


8942 


1331 


3117 


4903 


6689 


784CIP2B_1021


8994 


1332 


3118 


4904 


6690


784CIP2B 1022 


9023 


1333 


3119 


4905 


6691 


784CIP2B 1023 


9028 


1334 


3120 


4906 


6692 


784CIP2B 1024 


9058 


1335 


3121 


4907 


6693 


784CIP2B_1025


9058 


1336 


3122 


4908 


6694 


784CIP2B_1026


9079 


1337 


3123 


4909 


6695


784CIP2B 1027 


9079 


1338 


3124 


4910 


6696 


784CIP2B 1028 


9082 


1339 


3125 


4911 


6697 


784CIP2B 1029 


9084 


1340 


3126 


4912 


6698 


784CIP2B 1030 




1341 


3127 


4913 


6699


784CIP2B_1031


9101 


1342 


3128 


4914 


6700 


784CIP2B 1032 


9103 


1343 


3129 


4915 


6701 


784CIP2B 1033 


9105 


1344 


3130 


4916 


6702 


784CIP2B 1034 


9151 


1345 


3131 


4917 


6703 


784CIP2B 1035 


9161 


1346 


3132 


4918 


6704 


784CIP2B 1036 


9172 


1347 


3133 


4919 


6705 


784CIP2B 1037 


9174 


1348


3134 


4920 


6706 


784CIP2B_1038


9204


1349 


3135 


4921 


6707 


784CIP2B 1039 


9234 


1350 


3136 


4922 


6708 


784CIP2B 1040 


9235 


1351 


3137 


4923 


6709 


784CIP2B 1041 


9239 


1352 


3138 


4924 


6710 


784CIP2B 1042 


9256 


1353


3139


4925 


6711 


784CIP2B_1043


9276 


1354 


3140 


4926 


6712 


784CIP2B 1044 


9345 


1355 


3141 


4927 


6713 


784CIP2B_1045


9379 


1356


3142 


4928 


6714 


784CIP2B_1046 


9435 


1357 


3143 


4929 


6715 


784CIP2B_1047


9437 


1358 


3144 


4930 


6716 


784CIP2B 1048 


9469 


1359 


3145 


4931 


6717 


784CIP2B_1049


9500 


1360 


3146 


4932 


6718 


784CIP2B 1050 


9502 


1361 


3147 


4933


6719 


784CIP2B_1051


9520


1362 


3148 


4934 


6720 


784CIP2B 1052 


9541 


1363 


3149


4935 


6721 


784CIP2B 1053 


9541 


1364 


3150 


4936 


6722 


784CIP2B 1054 


9548 


1365 


3151 


4937 


6723 


784CIP2B 1055 


9556 


1366 


3152 


4938 


6724 


784CIP2B 1056 


9556 


1367 


3153 


4939 


6725 


784CIP2B 1057 


9575 


1368 


3154 


4940 


6726 


784CIP2B_1058 


9589 


1369 


3155 


4941 


6727 


784CIP2B 1059 


9599 


1370 


3156 


4942 


6728 


784CIP2B_1060


9602 


1371 


3157 


4943 


6729 


784CIP2B_1061


9606 


1372 


3158 


4944


6730 


784CIP2B 1062 


9622 


1373 


3159 


4945 


6731 


784CIP2B_1063


9623 


1374 


3160 


4946 


6732 


784CIP2B 1064 


9646 


1375 


3161 


4947 


6733 


784CIP2B 1065 


9747 


1376 


3162


4948 


6734 


784CIP2B 1066 


9773 


1377


3163 


4949 


6735 


784CIP2B_1067 


9785 


1378 


3164 


4950 


6736 


784CIP2B 1068 


9801 


1379


3165 


4951 


6737 


784CIP2B 1069 


9811 


1380 


3166 


4952 


6738 


784CIP2B 1070 


9843 


1381 


3167 


4953


6739 


784CIP2B 1071 


9854


1382


3168 


4954 


6740 


784CIP2B 1072 


9854 


1383


3169 


4955 


6741


784CIP2B 1073 


9864 


1384 


3170 


4956 


6742 


784CIP2B 1074 


9864 


1385 


3171 


4957 


6743 


784CIP2B 1075 


9871 


1386 


3172 


4958 


6744 


784CIP2B 1076 


9879 


1387 


3173 


4959 


6745


784CIP2B 1077 


9881 


1388 


3174 


4960 


6746


784CIP2B 1078 


9885 


1389 


3175 


4961 


6747 


784CIP2B 1079 


9901 


1390 


3176 


4962 


6748


784CIP2B_1080


9912 


1391 


3177 


4963


6749 


784CIP2B 1081 


9916 


1392 


3178 


4964 


6750 


784CIP2B 1082 


9921 


1393 


3179


4965 


6751 


784CIP2B 1083 


9925 


1394 


3180


4966 


6752


784CIP2B_1084


9930 


1395 


3181 


4967


6753 


784CIP2B 1085 


9949 


1396 


3182 


4968


6754


784CIP2B 1086 


9951 


1397 


3183 


4969 


6755 


784CIP2B 1087 


9958


1398 


3184 


4970


6756


784CIP2B 1088 


9973 


1399 


3185 


4971 


6757 


784CIP2B_1089


9982 


1400 


3186 


4972 


6758 


784CIP2B 1090 


9994 


1401 


3187 


4973 


6759 


784CIP2B_1091


10021 


1402 


3188 


4974


6760


784CIP2B 1092 


10041


1403


3189 


4975


6761 


784CIP2B 1094 


10067


1404 


3190 


4976 


6762 


784CIP2B_1095


10073 


1405


3191 


4977 


6763 


784CIP2B 1096 


10112 


1406 


3192 


4978 


6764 


784CIP2B 1097 


10117 


1407 


3193 


4979 


6765 


784CIP2B 1098 


10132


1408 


3194 


4980 


6766 


784CIP2B_1099


10149 


1409 


3195 


4981 


6767 


784CIP2B 1100 


10217 


1410 


3196 


4982 


6768 


784CIP2B 1101 


10226 


1411 


3197 


4983 


6769 


784CIP2B 1102 


10232 


1412 


3198 


4984 


6770 


784CIP2B 1103 


10237 


1413 


3199 


4985 


6771


784CIP2B_1104


10279 


1414 


3200 


4986 


6772 


784CIP2C 1 


33 


1415


3201


4987 


6773 


784CIP2C 2 


271 


1416 


3202 


4988 


6774 


784CIP2C_3 


848


1417 


3203 


4989 


6775 


784CIP2C_4 


849 


1418 


3204 


4990 


6776 


784CIP2C_5 


864 


1419 


3205 


4991 


6777 


784CIP2C 6 


953 


1420 


3206 


4992 


6778 


784CIP2C_7 


980 


1421 


3207 


4993 


6779 


784CIP2C 8 


1595 


1422 


3208 


4994


6780


784CIP2C_9 


1697 


1423 


3209 


4995 


6781 


784CIP2C_10 


1744


1424


3210


4996 


6782 


784CIP2C 11 


1937 


1425


3211 


4997 


6783 


784CIP2C 12 


1955 


1426 


3212 


4998


6784 


784CIP2C 13 


1955 


1427 


3213 


4999 


6785 


784CIP2C_14


2185 


1428 


3214 


5000 


6786 


784CIP2C 15 


2889 


1429 


3215


5001 


6787 


784CIP2C 16 


2901 


1430 


3216 


5002 


6788 


784CIP2C 17 


2902 


1431 


3217 


5003 


6789 


784CIP2C 18 


2905 


1432 


3218 


5004 


6790 


784CIP2C 19 


2948 


1433 


3219 


5005 


6791 


784CIP2C 20 


2956 


1434 


3220 


5006 


6792 


784CIP2C_21


2959 


1435 


3221 


5007 


6793 


784CIP2C 22 


2965 


1436 


3222 


5008 


6794 


784CIP2C 23 


2966 


1437 


3223 


5009 


6795 


784CIP2C_24


2970


1438 


3224 


5010 


6796 


784CIP2C_25


2985 


1439 


3225 


5011 


6797 


784CIP2C_26


2987 


1440 


3226 


5012 


6798 


784CIP2C 27 


2993 


1441 


3227 


5013 


6799 


784CIP2C_28


2993 


1442 


3228 


5014 


6800 


784CIP2C_29 


3017 


1443 


3229 


5015 


6801 


784CIP2C_30 


3046 


1444 


3230 


5016 


6802 


784CIP2C 31 


3050 


1445


3231 


5017 


6803 


784CIP2C_32


3357 


1446 


3232 


5018 


6804 


784CIP2C 33 


3359 


1447 


3233 


5019 


6805 


784CIP2C 34 


3432 


1448


3234 


5020 


6806


784CIP2C_35


3438 


1449 


3235 


5021 


6807 


784CIP2C 36 


3439 


1450 


3236 


5022 


6808


784CIP2C 39 


34(?3 


1451 


3237 


5023 


6809 


784CIP2C 40 


3466 


1452 


3238 


5024 


6810 


784CIP2C_41 


3466 


1453 


3239 


5025 


6811


784CIP2C 42 


3467 


1454 


3240 


5026 


6812


784CIP2C_43 


3468 


1455 


3241 


5027 


6813 


784CIP2C_44 


3483 


1456 


3242 


5028 


6814 


784CIP2C 45 


3484 


1457 


3243 


5029 


6815 


784CIP2C_46


3488 


1458


3244 


5030 


6816 


784CIP2C_47 


3491 


1459 


3245 


5031 


6817 


784CIP2C 48 


3493 


1460 


3246 


5032 


6818 


784CIP2C_49


3494 


1461 


3247 


5033 


6819


784CIP2C 50 


3495 


1462 


3248 


5034 


6820 


784CIP2C 51 


3496 


1463 


3249 


5035


6821


784CIP2C 52 


3503 


1464 


3250 


5036 


6822


784CIP2C_53 


3503 


1465 


3251 


5037 


6823 


784CIP2C 54 


3504 


1466 


3252 


5038 


6824 


784CIP2C 55 


3511 


1467 


3253 


5039 


6825 


784CIP2C_56


3531 


1468 


3254 


5040 


6826 


784CIP2C_57 


3536 


1469 


3255 


5041 


6827 


784CIP2C 58 


3546 


1470 


3256 


5042 


6828 


784CIP2C 59 


3548 


1471 


3257 


5043 


6829 


784CIP2C_60


3551 


1472 


3258 


5044 


6830 


784CIP2C 61 


3553 


1473 


3259 


5045 


6831 


784CIP2C 62 


3564 


1474 


3260 


5046 


6832 


784CIP2C 63 


3567 


1475


3261 


5047 


6833 


784CIP2C 64 


3572 


1476 


3262 


5048 


6834 


784CIP2C_65 


3573 


1477 


3263 


5049 


6835 


784CIP2C 66 


3574 


1478 


3264 


5050 


6836 


784CIP2C_67


3583 


1479 


3265 


5051 


6837 


784CIP2C 68 


3615 


1480 


3266 


5052 


6838


784CIP2C 69 


3623 


1481 


3267 


5053 


6839 


784CIP2C_70


3629


1482 


3268 


5054 


6840 


784CIP2C 71 


3666 


1483 


3269 


5055 


6841 


784CIP2C_72


3667 


1484 


3270 


5056 


6842 


784CIP2C 73 


3906 


1485 


3271 


5057


6843 


784CIP2C_74


3912


1486


3272 


5058 


6844 


784CIP2C_75 


3924


1487


3273 


5059 


6845 


784CIP2C_76


3928 


1488 


3274 


5060 


6846 


784CIP2C 77 


3935 


1489 


3275 


5061 


6847 


784CIP2C_78 


3959 


1490 


3276 


5062 


6848 


784CIP2C 79 


3981 


1491 


3277 


5063 


6849 


784CIP2C_80 


3989 


1492 


3278


5064 


6850 


784CIP2C_81 


4295 


1493 


3279 


5065 


6851 


784CIP2C_82 


4300


1494 


3280 


5066 


6852 


784CIP2C_83 


4360 


1495 


3281 


5067 


6853 


784CIP2C_84


4362 


1496 


3282 


5068 


6854 


784CIP2C 85 


4371 


1497


3283 


5069 


6855 


784CIP2C_86 


4373 


1498 


3284 


5070 


6856 


784CIP2C_87 


4376 


1499 


3285 


5071 


6857 


784CIP2C_89 


4378 


1500 


3286 


5072 


6858 


784CIP2C 90 


4382 


1501 


3287 


5073 


6859 


784CIP2C_91 


4409 


1502


3288 


5074 


6860 


784CIP2C_92


4421 


1503 


3289 


5075 


6861 


784CIP2C 93 


4421 


1504 


3290 


5076 


6862


784CIP2C_94


4426 


1505 


3291 


5077 


6863 


784CIP2C 95 


4430 


1506 


3292 


5078 


6864 


784CIP2C 96 


4435 


1507 


3293 


5079 


6865 


784CIP2C 97 


4436 


1508 


3294


5080 


6866 


784CIP2C_98


4439 


1509 


3295 


5081 


6867 


784CIP2C 99 


4440 


1510 


3296 


5082 


6868 


784CIP2C 100 


4441 


1511 


3297 


5083 


6869 


784CIP2C 101 


4442 


1512 


3298 


5084


6870

784CIP2C 102 


4455 


1513 


3299 


5085 


6871


784CIP2C 103 


4462 


1514 


3300 


5086 


6872 


784CIP2C 104 


4466 


1515


3301


5087 


6873 


784CIP2C 105 


4469 


1516 


3302 


5088


6874


784CIP2C_106


4471 


1517 


3303 


5089 


6875 


784CIP2C 107 


4481 


1518 


3304 


5090 


6876 


784CIP2C_108


4483 


1519 


3305 


5091 


6877 


784CIP2C 109 


4484 


1520 


3306 


5092


6878


784CIP2C_110


4486 


1521 


3307 


5093


6879 


784CIP2C 111 


4490 


1522 


3308 


5094 


6880 


784CIP2C 112 


4499 


1523 


3309 


5095 


6881 


784CIP2C 113 


4503 


1524 


3310 


5096 


6882 


784CIP2C 114 


"450^ 


1525


3311 


5097


6883 


784CIP2C_115


4509


1526


3312 


5098 


6884 


784CIP2C_116 


4514 


1527 


3313 


5099 


6885 


784CIP2C_117


4516 


1528 


3314 


5100 


6886 


784CIP2C 118 


4522 


1529 


3315 


5101 


6887 


784CIP2C_119


4525 


1530 


3316 


5102 


6888 


784CIP2C 120 


4527 


1531 


3317 


5103 


6889 


784CIP2C_121


4528 


1532 


3318 


5104 


6890 


784CIP2C_122


4529 


1533 


3319 


5105 


6891 


784CIP2C_123 


4532 


1534 


3320 


5106 


6892 


784CIP2C 124 


4537 


1535 


3321 


5107 


6893 


784CIP2C_125 


4538 


1536 


3322 


5108 


6894 


784CIP2C_126 


4551 


1537 


3323 


5109 


6895 


784CIP2C_127 


4552 


1538 


3324


5110 


6896 


784CIP2C_128


4559 


1539 


3325 


5111 


6897 


784CIP2C_129


4567 


1540 


3326 


5112 


6898 


784CIP2C_130 


4568 


1541 


3327 


5113 


6899 


784CIP2C_132 


4585


1542 


3328 


5114 


6900 


784CIP2C 133 


4592 


1543 


3329 


5115 


6901 


784CIP2C_134 


4609 


1544 


3330 


5116 


6902 


784CIP2C_135 


4616 


1545 


3331 


5117 


6903 


784CIP2C_136 


4617 


1546 


3332 


5118 


6904 


784CIP2C 137 


4618


1547


3333 


5119


6905 


784CIP2C 138 


4620 


1548 


3334 


5120 


6906 


784CIP2C 139 


4624 


1549 


3335 


5121 


6907 


784CIP2C 140 


4632 


1550 


3336 


5122 


6908 


784CIP2C_141


4634 


1551 


3337 


5123 


6909 


784CIP2C_142


4638 


1552 


3338 


5124 


6910 


784CIP2C 143 


4639 


1553 


3339


5125 


6911 


784CIP2C_144


4643 


1554


3340 


5126 


6912 


784CIP2C_145


4644 


1555 


3341 


5127 


6913 


784CIP2C_146 


4655 


1556 


3342 


5128 


6914 


784CIP2C 147 


4668 


1557 


3343 


5129 


6915 


784CIP2C_148 


4677 


1558 


3344 


5130 


6916 


784CIP2C_149


4677 


1559 


3345 


5131


6917 


784CIP2C_150 


4677


1560 


3346 


5132 


6918


784CIP2C_152


4682 


1561 


3347 


5133 


6919


784CIP2C 153 


4690 


1562 


3348 


5134 


6920 


784CIP2C_154


4691 


1563 


3349 


5135 


6921


784CIP2C_155 


4727 


1564 


3350 


5136 


6922 


784CIP2C_156 


4730


1565


3351 


5137 


6923 


784CIP2C 157 


4734 


1566 


3352 


5138


6924 


784CIP2C_158


47'ST 


1567 


3353 


5139 


6925 


784CIP2C 159 


4764 


1568 


3354 


5140 


6926 


784CIP2C 160 


4766 


1569 


3355 


5141 


6927 


784CIP2C_161


4793 


1570 


3356 


5142 


6928 


784CIP2C 162 


4825 


1571 


3357


5143 


6929 


784CIP2C 163 


4826


1572 


3358 


5144 


6930 


784CIP2C_164


4850 


1573 


3359 


5145 


6931 


784CIP2C 165 


4853 


1574 


3360 


5146 


6932 


784CIP2C_166


4855 


1575 


3361 


5147 


6933 


784CIP2C_167


4856 


1576 


3362 


5148 


6934 


784CIP2C_168


4867 


1577 


3363 


5149 


6935 


784CIP2C 169 


4869 


1578 


3364 


5150 


6936 


784CIP2C_170


4878 


1579 


3365 


5151


6937 


784CIP2C 171 


4880 


1580 


3366 


5152 


6938 


784CIP2C 172 


4942 


1581 


3367 


5153 


6939 


784CIP2C_173 


4945 


1582 


3368 


5154 


6940 


784CIP2C 174 


4950 


1583 


3369 


5155


6941


784CIP2C_175


4952 


1584 


3370 


5156 


6942 


784CIP2C 176 


4954 


1585 


3371 


5157 


6943 


784CIP2C 177 


4958 


1586 


3372 


5158 


6944 


784CIP2C 178 


4961 


1587 


3373 


5159 


6945 


784CIP2C 179 


5590 


1588


3374 


5160


6946


784CIP2C_180


5599 


1589 


3375 


5161


6947 


784CIP2C_181


5692 


1590 


3376 


5162 


6948 


784CIP2C 182 


5732 


1591 


3377 


5163 


6949 


784CIP2C_183


5765 


1592 


3378 


5164 


6950 


784CIP2C 184 


5771 


1593


3379 


5165 


6951 


784CIP2C 185 


5774 


1594 


3380 


5166 


6952 


784CIP2C 186 


5793 


1595 


3381


5167 


6953 


784CIP2C_187 


5806 


1596 


3382 


5168 


6954 


784CIP2C 188 


5852 


1597 


3383 


5169 


6955 


784CIP2C_189


5892 


1598 


3384 


5170 


6956 


784CIP2C_190 


6057 


1599


3385 


5171 


6957 


784CIP2C_191


6061 


1600 


3386


5172 


6958 


784CIP2C_192


6109 


1601 


3387 


5173 


6959 


784CIP2C 193 


6160 


1602 


3388 


5174 


6960 


784CIP2C 194 


6297 


1603 


3389 


5175 


6961 


784CIP2C_195


6398 


1604 


3390 


5176 


6962 


784CIP2C 196 


6398 


1605 


3391 


5177 


6963 


784CIP2C 197 


6415 


1606 


3392 


5178 


6964 


784CIP2C_198


6448 


1607 


3393 


5179 


6965 


784CIP2C 199 


6469


1608 


3394 


5180


6966


784CIP2C 200 


6474 


1609


3395 


5181 


6967 


784CIP2C 201 


6561 


1610 


3396 


5182 


6968 


784CIP2C_202 


6574


1611 


3397


5183 


6969 


784CIP2C 203 


6578


1612 


3398


5184 


6970 


784CIP2C 204 


6662 


1613 


3399 


5185 


6971 


784CIP2C_205 


6672 


1614 


3400 


5186 


6972 


784CIP2C_206 


6691


1615 


3401 


5187 


6973 


784CIP2C 207 


6695


1616 


3402 


5188 


6974 


784CIP2C 208 


6746 


1617 


3403 


5189 


6975 


784CIP2C 209 


6898 


1618 


3404 


5190 


6976 


784CIP2C 210 


6938 


1619 


3405 


5191 


6977 


784CIP2C 211 


6943 


1620 


3406 


5192 


6978 


784CIP2C_212


7110


1621 


3407 


5193 


6979 


784CIP2C 213 


7200 


1622 


3408 


5194 


6980 


784CIP2C 214 


7212 


1623 


3409 


5195 


6981 


784CIP2C 215 


7218 


1624 


3410 


5196 


6982 


784CIP2C_216


7249 


1625 


3411 


5197


6983


784CIP2C_217


7500 


1626 


3412 


5198 


6984


784CIP2C 218 


7509 


1627 


3413 


5199 


6985 


784CIP2C 219 


7523 


1628 


3414 


5200


6986 


784CIP2C 220 


7544 


1629


3415 


5201 


6987 


784CIP2C 221 


7564 


1630 


3416 


5202 


6988 


784CIP2C 222 


7568 


1631 


3417 


5203 


6989 


784CIP2C 223 


7631 


1632 


3418 


5204 


6990 


784CIP2C 224 


7813 


1633 


3419 


5205 


6991 


784CIP2C 225 


7831


1634


3420 


5206 


6992 


784CIP2C 226 


7843 


1635 


3421 


5207 


6993 


784CIP2C 227 


7907 


1636 


3422 


5208 


6994 


784CIP2C 228 


7943 


1637 


3423


5209 


6995 


784CIP2C 229 


8175 


1638 


3424 


5210 


6996 


784CIP2C 230 


8216 


1639 


3425


5211 


6997 


784CIP2C 231 


8225 


1640 


3426 


5212 


6998 


784CIP2C 232 


8271 


1641 


3427 


5213


6999 


784CIP2C_233


8397 


1642 


3428


5214


7000 


784CIP2C 234 


8466 


1643


3429 


5215 


7001 


784CIP2C 235 


8503 


1644 


3430 


5216 


7002 


784CIP2C_236 


8953


1645 


3431 


5217 


7003 


784CIP2C 237 


9106


1646


3432 


5218


7004


784CIP2C_238


9139 


1647 


3433 


5219 


7005 


784CIP2C 239 


9555


1648 


3434 


5220 


7006 


784CIP2C_240


9650


1649 


3435 


5221 


7007 


784CIP2C_241


9889 


1650


3436 


5222 


7008


784CIP2C_242


9933 


1651 


3437 


5223 


7009 


784CIP2C 243 


9953 


1652 


3438 


5224 


7010 


784CIP2C 244 


9981 


1653 


3439 


5225 


7011 


784CIP2D 1 


746 


1654 


3440 


5226 


7012 


784CIP2D 2 


35^58 


1655 


3441 


5227 


7013 


784CIP2D 3 


3553 


1656


3442 


5228


7014 


784CIP2D_4 


3633 


1657 


3443 


5229 


7015 


784CIP2D_5


3658 


1658 


3444 


5230 


7016 


784CIP2D_6


3732 


1659


3445 


5231 


7017 


784CIP2D 7 


4004 


1660 


3446 


5232 


7018 


784CIP2D_8


4700 


1661 


3447 


5233 


7019 


784CIP2D 9 


4703 


1662 


3448


5234


7020


784CIP2D 10 


4774 


1663 


3449 


5235 


7021 


784CIP2D_11


4894 


1664 


3450 


5236


7022 


784CIP2D_12 


4918 


1665 


3451 


5237 


7023 


784CIP2D 13 


5159 


1666 


3452 


5238 


7024 


784CIP2D_14


7443


1667 


3453 


5239 


7025 


784CIP2D_15


8673 


1668 


3454 


5240 


7026 


784CIP2D_16 


8679 


1669 


3455 


5241 


7027 


784CIP2D_17


8727 


1670 


3456 


5242 


7028 


784CIP2D 18 


8734 


1671 


3457 


5243 


7029 


784CIP2D 19 


8756 


1672 


3458 


5244 


7030 


784CIP2D 20 


8818 


1673 


3459 


5245 


7031 


784CIP2D 21 


8844


1674 


3460 


5246 


7032 


784CIP2D 22 


8846 


1675 


3461 


5247 


7033 


784CIP2D 23 


8912 


1676 


3462 


5248 


7034 


784CIP2D 24 


8918


1677 


3463 


5249


7035 


784CIP2D_25 


8918 


1678 


3464 


5250 


7036 


784CIP2D 26 


8941 


1679 


3465 


5251 


7037 


784CIP2D 27 


8941 


1680 


3466 


5252 


7038 


784CIP2D_28 


8951 


1681 


3467 


5253 


7039 


784CIP2D_29 


8951 


1682 


3468 


5254 


7040 


784CIP2D 30 


9007 


1683 


3469 


5255


7041 


784CIP2D 31 


9012 


1684 


3470 


5256 


7042 


784CIP2D 32 


9013 


1685 


3471 


5257 


7043 


784CIP2D_33 


9025 


1686 


3472 


5258 


7044


784CIP2D_34


9053 


1687 


3473 


5259 


7045 


784CIP2D_35


9054 


1688 


3474 


5260


7046 


784CIP2D_36


9054 


1689 


3475 


5261 


7047 


784CIP2D_37


9113 


1690 


3476 


5262 


7048 


784CIP2D_38 


9134 


1691 


3477 


5263 


7049 


784CIP2D_39 


4l$2 


1692 


3478 


5264


7050 


784CIP2D_40


9152 


1693 


3479 


5265


7051 


784CIP2D 41 


9211 


1694 


3480 


5266 


7052 


784CIP2D_42 


9223 


1695 


3481 


5267 


7053 


784CIP2D 43 


9223 


1696 


3482


5268 


7054


784CIP2D 44 


9231


1697 


3483 


5269 


7055


784CIP2D_45 


9236 


1698 


3484 


5270 


7056 


784CIP2D_46 


9236 


1699 


3485


5271


7057


784CIP2D 47 


9303 


1700 


3486 


5272 


7058 


784CIP2D 48 


9309 


1701 


3487


5273 


7059 


784CIP2D 49 


9314 


1702 


3488 


5274 


7060


784CIP2D 50 


9326


1703 


3489 


5275 


7061 


784CIP2D 51 


9339 


1704 


3490 


5276 


7062 


784CIP2D_52


9348


1705


3491 


5277 


7063


784CIP2D 53 


937* 


1706 


3492 


5278 


7064 


784CIP2D 54 


9382 


1707 


3493 


5279 


7065 


784CIP2D 55 


9407 


1708 


3494 


5280 


7066 


784CIP2D_56 


9414 


1709 


3495 


5281


7067 


784CIP2D_57


9439 


1710 


3496 


5282 


7068 


784CIP2D 58 


9485 


1711 


3497 


5283 


7069 


784CIP2D_59 


9493 


1712 


3498 


5284 


7070 


784CIP2D 60 


9501 


1713 


3499 


5285 


7071 


784CIP2D 61 


952* 


1714 


3500 


5286 


7072 


784CIP2D 62 


9526 


1715 


3501 


5287 


7073 


784CIP2D_63 


9551 


1716 


3502 


5288 


7074 


784CIP2D 64 


9557 


1717 


3503 


5289 


7075 


784CIP2D_65 


9568 


1718 


3504 


5290 


7076 


784CIP2D 66 


9588 


1719 


3505 


5291 


7077 


784CIP2D 67 


9547


1720 


3506 


5292 


7078 


784CIP2D 68 


9615 


1721 


3507 


5293 


7079 


784CIP2D_69


9628


1722 


3508 


5294 


7080 


784CIP2D 70 


9649 


1723 


3509 


5295 


7081 


784CIP2D_71 


9652


1724 


3510 


5296 


7082 


784CIP2D_72 


9660 


1725 


3511 


5297


7083 


784CIP2D_73


9662 


1726 


3512 


5298 


7084 


784CIP2D 74 


9725 


1727 


3513


5299 


7085 


784CIP2D_75 


9746 


1728 


3514 


5300 


7086 


784CIP2D 76 


9777 


1729 


3515 


5301 


7087


784CIP2D_77


9787 


1730 


3516


5302


7088 


784CIP2D 78 


9790 


1731 


3517 


5303 


7089 


784CIP2D 79 


9842 


1732 


3518 


5304 


7090 


784CIP2D_80 


9842


1733 


3519 


5305 


7091 


784CIP2D 81 


9846 


1734 


3520 


5306 


7092 


784CIP2D 82 


9867 


1735 


3521 


5307 


7093 


784CIP2D_83


10010


1736 


3522 


5308 


7094 


784CIP2D 84 


10011


1737 


3523 


5309 


7095 


784CIP2D 85 


100*2 


1738 


3524 


5310 


7096


784CIP2D_86 


10057 


1739 


3525 


5311 


7097 


784CIP2D 87 


10085 


1740 


3526 


5312 


7098 


784CIP2D 89 


10139 


1741 


3527 


5313 


7099


784CIP2D_90


10142 


1742 


3528 


5314 


7100


784CIP2D 92 


10165 


1743 


3529


5315 


7101 


784CIP2D 93 


10173 


1744 


3530 


5316 


7102 


784CIP2D 94 


10173 


1745 


3531 


5317 


7103 


784CIP2D 95 


10273 


1746 


3532 


5318


7104 


784CIP2E 1 


3121 


1747 


3533 


5319 


7105 


784CIP2E 2 


3628 


1748


3534 


5320 


7106 


784CIP2E 4 


3673 


1749


3535 


5321 


7107 


784CIP2E_5


4018 


1750 


3536 


5322 


7108


784CIP2E_6


4467 


1751 


3537 


5323 


7109 


784CIP2E 7 


4865 


1752 


3538 


5324 


7110 


784CIP2E 8 


4916 


1753 


3539 


5325 


7111 


784CIP2E 9 


4923 


1754 


3540 


5326 


7112 


784CIP2E_10


4926 


1755 


3541 


5327 


7113 


784CIP2E_11


4962 


1756 


3542 


5328 


7114 


784CIP2E 12 


4963 


1757 


3543 


5329 


7115 


784CIP2E 13 


4964 


1758


3544 


5330 


7116 


784CIP2E 14 


4988


1759


3545 


5331 


7117 


784CIP2E 15 


5835 


1760 


3546 


5332 


7118 


784CIP2E 16 


7682 


1761 


3547 


5333 


7119 


784CIP2E 17 


7682 


1762 


3548 


5334 


7120 


784CIP2E 18 


7699 


1763 


3549 


5335 


7121 


784CIP2E_19


7707


1764 


3550 


5336


7122 


784CIP2E 20 


7707 


1765 


3551 


5337 


7123 


784CIP2E 21 


7752 


1766 


3552 


5338 


7124 


784CIP2E_22 


8357 


1767 


3553 


5339 


7125 


784CIP2E_23 


9065 


1768 


3554 


5340 


7126 


784CIP2E_24 


9324 


1769 


3555 


5341 


7127 


784CIP2F 1 


2976 


1770 


3556 


5342 


7128 


784CIP2F 2 


3559 


1771 


3557 


5343 


7129 


784CIP2F 3 


4021 


1772 


3558 


5344 


7130 


784CIP2F 4 


4474 


1773 


3559 


5345 


7131 


784CIP2F_5


4566 


1774 


3560 


5346 


7132


784CIP2F_6 


4705 


1775 


3561 


5347 


7133 


784CIP2F 7 


4707 


1776 


3562 


5348 


7134 


784CIP2F 8 


4712 


1777 


3563 


5349 


7135 


784CIP2F 9 


5006 


1778 


3564 


5350


7136 


784CIP2F 10 


5009 


1779 


3565 


5351 


7137 


784CIP2F_11


5015


1780 


3566 


5352 


7138


784CIP2F_12


5015 


1781 


3567 


5353 


7139 


784CIP2F 13 


7724 


1782 


3568 


5354 


7140 


784CIP2F 14 


7725 


1783 


3569 


5355 


7141 


784CIP2F_15


8d2& 


1784 


3570 


5356 


7142 


784CIP2F_16


8830


1785 


3571 


5357 


7143 


784CIP2F 17 


9739 


1786


3572 


5358 


7144 


784CIP2F 18 


9896 



TABLE 7



SEQ 
ID 

NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


5359 


337 


1131 


AHLSARLSAilLDEVAILPAPQNLSVIiSTNMKHLLMWS PVIAPG 
ETVYYSVEYQGEYBSLYTSHIWIPSSWCSLTEGP3CDVTDDITA 
TVPYNLRVRATLGSQTS /CLEHP/VS I PLIBTQPSLPDL/RMEI 
TKDGFHLVIELEDIiGPQFEFLVAYWRRBPGAEEHVKMVR5GGIP 
VHLETME PGAAYC VKAQTFVKA IGRYS AFS QTECVEVQGEAI PL 
VLiALFAFVGFMLILVVVPLFVWKMGRLLQ/YLIiLPRGGSSQTPW 
KITQF 


5360 


2 


1115 


PRVRS SGGQEDPASQQWARPRFTQPSKMRRRVI ARPVGS SVRLK 
CVASGHPRPDI TWMKDDQAI/TRPEAAEPRKKKWTIjSLKNLRPED 
SGKYTCRVSNRAGAINATYKVDVI QRTRS KP VLTGTH PVNTTVD 
FGGTTSFQCKVRSDVKPVIQWLKRVEYGAEGRHNSTIDVCK3QKF 
WL PTG DVWS RPDGS YLNKL LITRARQDDAGMY I CLGANTMG YS 
FRSAFLTVLPDPKPPGPPVASSSSATSLPWPWIGIPAGAVFIL 
GTLLLWLCQAQ KKPCTPAPAP PLPGHR? PGTARD RS GDKDL P SL 
AALSAGPGVGLCEEHGSPAAPQHLLGPGPVAGPKLYPKLYTGHS 
TPHTYTHPPFSCQLNSSHS 


5361 


3 


925 


HEGS I S SAN I LLDDQFQPKLTDFAMAHFRSHLEHQSCT INMTS S 
SSKHLWYMPEEYIRQGKLSIXTDVYSFGIVIMEVLTGCRWLDD 
PKH I QLRDLLRELME KRGLDSCLS FLDKKV P P CPRNFS AKLFCL 
AGRCMTRAKLRPSMDEVLNTLESTQASLYFABDPPTSLKSFRC 
PSPLFLENVPS I PVEDDESQNNNLLPSDEGLRIDRMTQKTPFEC 
S QSE VM FLS LDKKPES KRNEEACNMPS S S CEES W FPKY I VPS QD 
LRPYKVNIDPSSEAPGHSCRSRPVESSCSSKFSWDEYBQYKKE 


5362


2 


4879 


SCQ VEGCTRT YtTS S QS IG KHMKTAHPDQ YAA FKMQRKS KKGQ5^A 
NNLKTPNNGKFVYFLPSPVNSSNPFFTSQTKANGNPACSAQLQH 
VS P P I FPAHLAS VSTPLLS SMESVI N PN I TSQDXN3QGGMLCS Q 
MENLPSTALPAQKEDLTKTVLPLNIDRGSDPFLS LPAESSS IDL 
FPS PADSGTNS VFS QLENNTNHYS SQ I EGNTNS S FLKGGNGENA 
VFPSQVNVANKFSSTNAQQSAPEKVKKDRGRGQTGKERKPKHNK 
RAKWPAI IRDGKF I CSRCYRAFTN PRS LGGHLS KRS YCKPLDGA 
EIAQEIujQSNGQPSLLASMILSTNAVNLQQPQQSTFNPEACFKD 
P S FLQLLAENRSPAFLPNT FPRSGVTNFNTS VSQEGS E 1 1 IQAI* 
ErAGIPSTFEGAEMLSHVSTGCVSDASQVNATVMPNPTVPPLLH 
TVCHPNTLLTNQNRTSNS KTSS IEECS SLPVFPTNDLLLKTVEN 
GhCSSS FPNSGGPSQNFTSNSSRVSVI SGPQNTRSSHLNKKGNS 
ASKRRKKVAPPI.1APNASQNLVTSDLTTMGLIAKSVE IPTTNLH 
SNVIPTCEPQSLVENLTQKLNNVNNQLFMTDVKENFKTSLESHT 
VLAPLTLKTENGDSQMMAI^SCTTSVNSDLQISEDNVIQNFEKT 
LE 1 1 KTAMNS Q I LE VKSG S QGAGE TS QNAQ I NYNI QL P S VNTVQ 
NNKL PDSS P \ FS S FI SVMP TESNI PQS E\ VSHKEDQ I QE I LEGL 
QKLKLENDLSTPASQCVIilNTS VTLTP TP VKSTADI T VI QP VS2 
MIMIQFNDKVNKPFVCQNQGCNYSAMTKDALFKHYGKIHQYTPE 
MI LE I KKNQLKFAP FKC WPTCTKTFTRNSNLRAHCQLVHHFTT 
EEMVKLKI KRPYGRKSQSENVPASRSTQVKKQLAMTEENKKESQ 
PALELRAETQNTHSNVAVI PEKQLI EKKS PDKTES SLQ VIT VTS 
E QCNTNALTNTQTKGRKIRRHKKE KEE KKRKKPVS QSLEFPTR Y 
S PYRF YRCVHQGCFAAFTIQQNLI LH YQAVHKSDLPAFSAEVE E 
ESEAGKESEETETKQTLKEFRCQVSDCSRIFQAITGLIQHYMKL 
HEMTPEEIESMTASVDVGKFPCDQLECKSS FTTYLNYWHLEAD 
HGIGIjRASKTEEDGVYKCDCEGCDRIYATRSNIXRHIFNKHNDK 
HKAHL I RPRRLTPGQENMSS KANQEKS KSKHRGTKHS RCGKEGI 
KMPKTKRKKKNNLENKNAKIVQIEENKPYSLKRGKHVYSIKAPJ^ 
DALSECTSRFVTQYPCMIKGCTSWTSESNIIRHYKCHKLSKAF 
TSQHRNLLIVFKRCCNSQVKETSEQEGAKNDVKDSDTCVSESND 
NSRTTAT VSQKE VB KNE* DEMDELTEL FITKL INEDSTS VETQA 
NTS SNVSND FQEDNL CQSERQ KAS NLKRVNKEKNVS QNKKRKVE 
KAEPASAAELSSVRKEEETAVAIQTIEBHPASFDWSSFKPMGFE 
VS FLKFLEESAVKQKKNTDKD HPNTGNKKGSHS NS RKNIDKT AV 
TSGNHVCPCKESETFVQFANPSQLQCSDNVKIVLDKNLKDCTEL 
VIjKQLQEMKPTVSLKKLBVHSNDPDMSVMKDr S IGKATGRGQ Y 


5363 


8066 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP 
PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRLNML 
RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAQQMVQPQSPVAVS
QSKPGCYDNGKHYQINQQWERTYLGNALVCTCYGGSRGFNCESK
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS
CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGNGKGEWT
CKPIAEKCFDHAAGTSYVVGETWEKPYQGWMMVDCTCLGEGSGR
I TCTSRNR CNDQDTRTS YRIGDTWS KKDNRGNLLQC I CTGNGRG 
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DS G WYS VGMQLA* KTQGNKQML \ CTCLGNGVSCQETAVTQTYG 
GNSNGEPCVLPFTYNGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ 
KYS FCTDHTVLVQTRGGNSNGALCHFPFLYNNHNYTDCTSEGRR 
DNMKWGGTTQN YDADQKFGFC PMAAHEE I CTTNEGVMYR IGBQW 

dkqhdmghmmrctc vgngrgewtc i aysqlrdqci vdd i tynvn 
dtfhkrheeghmlnctc pgqgrgrwkcdp vdqcqds etgt fyqx 
gdswekyvhgvryqcycygrgigewhcqplotypsssgpvevfi 
tetpsqpns h p xqwnapqpsh i s kyi lrwr pkns vgrwke at i p 
ghlns yti kglkpgwyegqlis iqqyghqevtrfdftttstst 
pvtsnt\vtgettpfsplvatsesvteitassfwswvsasdtv 
sgfrveyelseegdbpqylvlpstatsv\nip\dllpgrkyivn 
vyqisetx3eqslilstsqttapdappdptvdqvddtsivvrwsr 

PQAPITGYRIVYSPSVEGSSTELNLPETANSVTLSDIjQPGVQYN 

itiyaveenqestpwiqqettgtprsdtvpsprdlqfvevtdv 
kvtimwtppesavtgyrvdvipvnlpgehgqrlplsrntf\aen 

TGLS PGVTYYFKVFAVSHGRESKPLTAQQTTKb\DAPTNLQF VN 

etdstvlvrwtppraqitgyrltvgltrrgqprqynvgpsvsky 
plrnlqpaseytvslvaikgnqespkatgvfttlqpgssippyn 
tevtettivitwtpaprigfklgvrpsqggeaprevtsdsgsiv 
vsgltpgve yvyti qvlrdsqerdap\ i vnk\ vvtpls pptnlh 
leanpdtgvltvswersttpditgyritttptngqqgnsleevv 
hadqs sct f \ dnle vpgle ynvs vytvkddkes vpi sdti i pav 
p pptdlrftn/ ilgpdtmrvtw\appps idltnflvrysp vkne 
grmlqsls 1 fflsdn\awltnllpgteywsvssvyeqhestp 
\lrgrqktgldsp\tgidfs\dita\nsft\vhw\iapra/tpi 
tgyrir\hhpehf\sgrpredr\vphsrnsitltnltpgteyw 
sivai,ngrbespiiligqqstvsdvprdl.ewaatptsli,l\swd 
ap avtvryyr itygetggnspvqe ftvpg s ks tati sglkpgvd 
ytitvyavtgrgds pass kpisinyrtei dkpsqmqvtdvqdns 
i s vkwlpss s pvtgyrvttt \ pkngpg\p tktktagpdqtemt i 
eglqpl^eywsvyao^psgesqplvqtavtnidrjkglaftdv 
dvds i ki awes pqgqvs r yrvtys s pedg ih elfpapdgeedta 
elqglrpgseytvswalhddmesqpligtqstaipaptdlkft 
qvtpts ls aqwtppnvqltgyrvrvtpkektgpmke inlapds s 
s vwsglmvatkye vs vyaiikdtltsrpaqgvvttlenvs pprr 
arvtdatett i tis wrtktetitg fqvdavpangqt p iqrttkp 

DVRS YTITGLQPGTD YKI YLYTLNDNARSS PWIDAS TA IDAPS 
NLRFIjATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREWPRP 
RPGVTEAT1TGLEPGTEYTIYVIALKNNQKSEPLIGRKKTDELP 
QLVTLPHPNLHGPE ILDVPSTVQKTPFVTHPGYDTGNGIQLPGT 
SGQQ PSVGQQM I FEEHGFRRTTP PTTATPIRHRPRP YPPNVGQE 
ALSOTT I S WAP FODTS E YI ISCHPVGTDR E PT.O FR VPnT<?T cs at 
LTGLTRGATYN I IVEAI»KD0X2RH KVREEVVTVGNS VNEGLNQPT 
DDS C FDPYTVSH YAVGDE WERWS ESGFKLLCQCLGFGSGHFRCD 
S SRWCHDNGVNYKIGE KWDRQGENGQMMS CTCIiGNGKGE FKCDP 
HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR 
RPGGEPS PEGTTGQS YNQYSQRYHQRTNTNVNC P I ECFM PLDVQ 
ADREDSRE 


5364 


8066 


703


RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP
PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRLNML
RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAQQMVQPQSPVAVS
QSKPGCYDNGKHYQINQQWERTYLGNALVCTCYGGSRGFNCESK
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS
CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGNGKGEWT
CKPIAEKCFDHAAGTSYVVGETWEKPYQGWMMVDCTCLGEGSGR
ITCTSRNRCNDQDTRTS YRIGDTWS KKDNRGNIjIiQC I C TGNGRG 
BWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DSGWYS VGMQLA* KTQGKTKQMlA CTCLGNGVS CQETAVTQTYG 
GNSWGEPCVLP PTYNGRTPYSCTTEGRQDGHLWCSTTSNYEQDQ 
KYSFCTDHTVLVQTRGGNSNGALCHFPFLYNNHNYTDCTSEGRR 
DNMKWCGTTQN YDADQKFGFCPMAAH3E I CTTNEGVMYRIGDQW 
DKQHDMGHMMRCTCVGNGRGEWTCIAYSQIjRDQCIVDDITYNVN 
DTFHKRHEEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI 
GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI 
TETPSQPNSHPIQWNAPQP3HI S KYI LRWRPKNSVGRWKEATIP 
GHLNS YT I KGLX PG WYEGQLI S I QQ YGHQEVTR FDFTTTS TST 
PVTSNT\VTGETTPFSPLVATSESVTEITASSFWSWVSASDTV 
SG FRVEYELSEEGDEPQYLVLPSTATSV\NIP\DI»LPGRKYIVN 
VYQ I S EDGEQS L I LSTSQTTAPDAP PDPTVDQVDDTS I WRWSR 
PQAPITGYRIVYSPSVEGSSTELNLPETANSVTLSDLQPGVQYN 
ITIYAVEENQESTPW1QQETTGTPRSDTVPSPRDLQFVEVTDV 
KVTI MWTPPES AVTO YR VDVT PVNIiPGEHGQRLPLS RNTF \ AEN 
TGLS PGVT Y YFKVFAVSHGR ESKPLTAQQTTXIi\DAPTNI*QFVN 
ETDS TVLVRWTP PRAQ I TG YRLTVGLTRRGQ PRQ YNVGP SVS KY 
PLRNLQPASEYTVSLVAIKGNQESPKATGVFTTLQPGSS I PPYN 
TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV 
VSGLTPGVE YVYTIQVLRDGQERDAP \ I VNK\ WTPLS PPTNLH 
LEANPDTGVLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEVV 
HADQSSCTF\DNLEVPGLEYNVSVYTVKDDKESVPISDTIIPAV 
PPPTDLRFTN / 1 IX3PDTMRVTW \APPPS IDLTNFLVRYS P VKNE 
GRMLQSLSIFFLSDWVAWLTNLLPGTEYWSVSSVYEQHESrP 
\LRGRQKTGLDSP\TGIDFS \DITA\MSFT\VHW\ IAPRA/TPI 
TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITLTH^TPGTEYW 
S1VALNGREESPLL1GQQSTVSDVPRDLEWAATPTSLLI\SWD 
APAVT VRYYR I TYGETGGNS P VQEFTVPGS KS TATI SGLKPGVD 
YTITVYAVTGRGDSPASSKPISINYRTEIDKPSQMQVTDVQDNS 

ISVKWLPSSS pvtgyrvttt\pkngpg\ ptktktagpdqtemti 

EGLQPTVEYWSVYAQNPSGESQPLVQTAVTNIDRPKGIAFTDV 
DVDSIKIAWESPQGQVSRYRVTYSSPEDGIHEIiFPAPDGEEDTA 
B LQGLRPGSEYTVS WALHDDMES QP L I GTQS TAI PAPTDLKFT 
QVTPTSLSAQWTPPNVQliTGYRVRVTPKEKTGPMKE INLAPDSS 
SVWSGLMVATKYEVSVYALKDTLTSRPAQGVVTTLENVSPPRR 
ARVTDATETT I T I S WRTKTET 1 TGFQVDAVPANGQT P I QRT I KP 
DVR3YTITGLQPGTDYKI YIiYTLNDNARSS P WIDAS TAIDAPS 
NLRFLATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREWPRP 
RPGVTEATITGIiE PGTEYTI YV IALKNNQKSEPIjIGRKKTDELP 
QLVTLPHPNLHGPEILDVPSTVQKTPFVTHPGYDTGNG1QLPGT 
SGQQPSVGQQMIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE 
ALSQTTISWAPFQDTSEYIISCHPVGTDEEPWFRVPGTSTSAT 
LTGLTRGATYNIIVEAIjKDQORHKVREEVVTVGNSVNEGLNQPT 
DDSCFDPYTVSHYAVGDEWERMSESGFKLLCQCIiGFGSGHFRCD 
SSRWCHDNGVNYKIGEKWDRQGENGQMMSCTCLGNGKGEFKCDP 

RPGGEPS PEGTTGQSYNQYSQRYHQRTNTNVNCPI ECFMPLDVQ 
ADREDSRE 


5365




703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP
PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRLNML
RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAQQMVQPQSPVAVS
QSKPGCYDNGKHYQINQQWERTYLGNALVCTCYGGSRGFNCESK
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS
CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGNGKGEWT
CKPIAEKCFDHAAGTSYVVGETWEKPYQGWMMVDCTCLGEGSGR
ITCTSRNRCNDQDTRTSYRIGDTWSKKDNRGNLLQCICTGNGRG
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT
DSGWYSVGMQLA*KTQGNKQML\CTCLGNGVSCQETAVTQTYG
GNSNGEPCVLPFTYNGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ
KYSFCTDHTVLVQTRGGNSNGALCHFPFLYNNHNYTDCTSEGRR
DNMKWCGTTQNYDADQKFGFCPMAAHEEICTTNEGVMYRIGDQW
DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRDQCIVDDITYNVN
DTFHKRHEEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI
GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI
TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP
GHLNSYTIKGLKPGWYEGQLISIQQYGHQEVTRFDFTTTSTST
PVTSNTWTGETTPFSPLVATSESVTEITASSFWSWVSASDTV
SGFRVEYELSEEGDEPQYLVLPSTATSV\NIP\DLLPGRKYIVN
VYQISEDGEQSLILSTSQTTAPDAPPDPTVDQVDDTSIWRWSR
PQAPITGYRIVYSPSVEGSSTELNLPETANSVTLSDLQPGVQYN
ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDLQFVEVTDV
KVTIMWTPPESAVTGYRVDVTPVNLPGEHGQRLPLSRNTF\AEN
TGLSPGVTYYFKVFAVSHGRESKPLTAQQTTKL\DAPTNLQFVN
ETDSTVLVRWTPPRAQITGYRLTVGLTRRGQPRQYNVGPSVSKY
PLRNLQPASEYTVSLVAIKGMQESPKATGVFTTLQPGSSIPPYN
TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV
VSGLTPGVEYVYTIQVLRDGQERDAP\IVNK\WTPLSPPTNLH
LEANPDTGVLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEVV
HADQSSCTF\DNLRVPGLEYNVSVYTVKDDKESVPISDTIIPAV
PPPTDLRFTN/ILGPDTMRVTW\APPPSIDLTNFLVRYSPVKNE
GRMLQSLSIFFLSDN\AWLTNLLPGTEYWSVSSVYEQHESTP
\LRGRQKTGLDSP\TGIDFS\DITA\NSFT\VHW\IAPRA/TPI
TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITLTNLTPGTEYW
SIVALNGREESPLLIGQQSTVSDVPRDLEWAATPTSLLI\SWD

APAVTVRYYRITYGETGGNSPVQEFTVPGSKSTATISGLKPGVD
YTITVYAVTGRGDSPASSKPISINYRTEIDKPSQMQVTDVQDNS
ISVKWLPSSSPVTGYRVTTT\PKNGPG\PTKTKTAGPDQTEMTI
EGLQPTVEYWSVYAQNPSGESQPLVQTAVTNIDRPKGLAFTDV
DVDSIKIAWESPQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA
ELQGLRPGSEYTVSVVALHDDMESQPLIGTQSTAIPAPTDLKFT
QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINLAPDSS
SVVVSGLMVATKYEVSVYALKDTLTSRPAQGVVTTLENVSPPRR
ARVTDATETTITISWRTKTETITGFQVDAVPANGQTPIQRTIKP
DVRSYTITGLQPGTDYKIYLYTLNDNARSSPWIDASTAIDAPS
NLRFLATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREWPRP
RPGVTEATITGLEPGTEYTIYVIALKNNQKSEPLIGRKKTDELP
QLVTLPHPNLHGPEILDVPSTVQKTPFVTHPGYDTGNGIQLPGT
SGQQPSVGQQMIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE
ALSQTTISWAPFQDTSEYIISCHPVGTDEEPLQFRVPGTSTSAT
LTGLTRGATYNIIVEALKDQQRHKVREEVVTVGNSVNEGLNQPT
DDSCFDPYTVSKYAVGDEWERMSESGFKLLCQCLGFGSGHFRCD
SSRWCHDNGVNYKIGEKWDRQGENGQMMSCTCLGNGKGEFKCDP
HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR
RPGGEPSPEGTTGQSYNQYSQRYHQRTNTNVNCPIECFMPLDVQ
ADREDSRE




8066 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP
PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRLNML
RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAQQMVQPQSPVAVS
QSKPGCYDNGKHYQINQQWERTYLGNALVCTCYGGSRGFNCESK
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS
CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGNGKGEWT
CKPIAEKCFDHAAGTSYWGETWEKPYQGWMMVDCTCLGEGSGR
ITCTSRNRCNDQDTRTSYRIGDTWSKKDNRGNLLQCICTGNGRG
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT
DSGWYSVGMQLA*KTQGNKQML\CTCLGNGVSCQETAVTQTYG
GNSNGEPCVLPFTYNGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ
KYSFCTDHTVLVQTRGGNSNGALCHFPFLYNNHNYTDCTSEGRR
DNMKWCGTTQNYDADQKFGFCPMAAHEEICTTNEGVMYRIGDQW
DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRDQCIVDDITYNVN
DTFHKRHEEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI
GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI
TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP
GHLNSYTIKGLKPGWYEGQLISIQQYGHQEVTRFDFTTTSTST
PVTSNT\VTGETTPFSPLVATSESVTEITASSFWSWVSASDTV
SGFRVEYELSEEGDEPQYLVLPSTATSV\NIP\DLLPGRKYIVN
VYQISEDGEQSLILSTSQTTAPDAPPDPTVDQVDDTSIWRWSR
PQAPITGYRIVYSPSVEGSSTELNLPETANSVTLSDLQPGVQYN
ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDLQFVEVTDV
KVTIMWTPPESAVTGYRVDVIPVNLPGEHGQRLPLSRNTF\AEN
TGLSPGVTYYFKVFAVSHGRESKPLTAQQTTKL\DAPTNLQFVN
ETDSTVLVRWTPPRAQITGYRLTVGLTRRGQPRQYNVGPSVSKY
PLRNLQPASEYTVSLVAIKGNQESPKATGVFTTLQPGSSIPPYN
TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV
VSGLTPGVEYVYTIQVLRDGQERDAP\IVNK\WTPLSPPTNLH
LEANPDTGVLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEVV
HADQSSCTF\DNLEVPGLEYNVSVYTVKDDKESVPISDTIIPAV
PPPTDLRFTN/ILGPDTMRVTW\APPPSIDLTNFLVRYSPVKNE
GRMLQSLSIFFLSDN\AWLTNLLPGTEYWSVSSVYEQHESTP
\LRGRQKTGLDSP\TGIDFS\DITA\NSFT\VHW\IAPRA/TPI
TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITLTNLTPGTEYW
SIVALNGREESPLLIGQQSTVSDVPRDLEWAATPTSLLI\SWD

APAVTVRYYRITYGETGGNSPVQEFTVPGSKSTATISGLKPGVD
YTITVYAVTGRGDSPASSKPISINYRTEIDKPSQMQVTDVQDNS
ISVKWLPSSSPVTGYRVTTT\PKNGPG\PTKTKTAGPDQTEMTI
EGLQPTVEYWSVYAQNPSGESQPLVQTAVTNIDRPKGLAFTDV
DVDSIKIAWESPQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA
ELQGLRPGSEYTVSVVALHDDMESQPLIGTQSTAIPAPTDLKFT
QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINLAPDSS
SVVVSGLMVATKYEVSVYALKDTLTSRPAQGVVTTLENVSPPRR
ARVTDATETTITISWRTKTETITGFQVDAVPANGQTPIQRTIKP
DVRSYTITGLQPGTDYKIYLYTLNDNARSSPWIDASTAIDAPS
NLRFLATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREWPRP
RPGVTEATITGLEPGTEYTIYVIALKNNQKSEPLIGRKKTDELP
QLVTLPHPNLHGPEILDVPSTVQKTPFVTHPGYDTGNGIQLPGT
SGQQPSVGQQMIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE
ALSQTTISWAPFQDTSEYIISCHPVGTDEEPLQFRVPGTSTSAT
LTGLTRGATYNIIVEALKDQQRHKVREEVVTVGNSVNEGLNQPT
DDSCFDPYTVSKYAVGDEWERMSESGFKLLCQCLGFGSGHFRCD
SSRWCHDNGVNYKIGEKWDRQGENGQMMSCTCLGNGKGEFKCDP
HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR
RPGGEPSPEGTTGQSYNQYSQRYHQRTNTNVNCPIECFMPLDVQ
ADREDSRE


5367 


235 


3591 


kkii^mlck^ivieyiadXLyeylygfcfsgikkyliihvlrl 
i le lwmtrllle ks vs l qtq ylll ivki ls wfpg kemrhhlq i m 
EVMMRKQDS / rxvgngseqqlqkeladvlmdppmddqpgekelv 

KRSQLDGEGDGPLSNQLSASSTINPVPLVGLOKPEMSIiPVKPGQ 
GDSEASSPFTPVADEDSWFSKLTYLGCASVNAPRSEVEALRMM 
SILRSQCQISLDVTLSVPNVSEGIVRLLDPQTNTEIANYPIYKI 
LFCVRGHDGTPESDCFAFTESHYNAELFRIHVFRCEIQEAVSRI 
LYS FATAFRRS AXQTPLS ATAAPQTPDSD I FTFS VS LE I KEDDG 
KGYFSAVPKDKDRQCFXLRQG I DKKIV1 YVQQTTNKEXiAI BRCF 
GliLLSPGKDVRNSDMHLLDLESMGKSSDGKSYVTTGSWNPKSPH 
FQVVNEETPKDKVLFMTTAVDLVITEVQEPVRFLLETKVRVCSP 
NERLFWPFS KRSTTENFFLKLKQI KQRERKNNTDTLYBWCLBS 
ES ERERRKTTASPS VR LPQSGSQS S VI PS P PEDDEE EDNDEPLL 
S GSGDVS KECAEKlLETWGELliS KWHLNLNVR PKQLS SLVRNG V 
PEALRGEWOJjIiAGCHNNDHLVEKYRIIilTKESPQDSAITRDIH 
RTF P AHDY FKDTGGDGQD SLYKI C KAYS VYDEE IGYCQGQSFLA 
AVLLLHMPE EQAFSVLVK IMFDYGLRELFKQNFEDLHCKF YQLB 
RLMQE YI PDL YNHFLD I SLEAHM YASQWFLTLFTAKFPLYMVFH 
IIDLLLCEGISVIFNVALGLLKTSKDDLLJ.TDFEGALKFFRVQL 
PKR YRSEENAKKLMBLACNMKI SQKKLKKYEKEYHTMREQQAQQ 
EDP t ERFERENRRLQEANMRLEQENDDLAHELVTS KIALRKDLD 
NAEEKADALNK3LLMTKQXLIDAEEEKRRLEEESAHLKKMCRRE 
LDKABSEIKKNSSriGDYKQrCSQLSERLEKQQTANKYEIEKlR 
QKVDDCBRCRE FFNKEGRVKG 1 SS TKEVLDEDTDEE KETLKNQL 
REMELELAQTKI»\QLVEASCK1QD\LEHPF*GLPFNE\VQAA\K 
KTWFNRTLS S I KTATGVQGKETC 


5368 


573 


2014 


GAAAGAADPRRGSUSGRTMLDFAi FAVTFLLALVGAVLYLYPAS 
RQAAGIPGITPTEEKDGNLPD1VNSGSLHEFLVNLHBRYGPWS 
FWFGRRLWSIX3TVD VLKQHINPNKTLD /ii F *NHAE VI IKVSIW 
WWQCE*KP\QRKKLYBNGVTDSLKSNFALLLKLPEBLLDKWI*SY 
PBTQH\VPLSQHMLGFAMKSVTQMVMGSTFEDDQEVIRFQKNHG 
TVWSEIGKGFLDGSLDKNMTRKKQYEDALMQIiESVLRNIIKERK 
GRNFSQHIFIOSLVQGNLNDQQIIiEDSMI FSLASCI ITAKLCTW 
AI WFLTTSEEVQKKL YE E I NQVFGNGP VTPEKIEQLRYCQHVLC 
ETVRTAKLTPVSAQLQDI EGKIDRFI IPRETLVL YALGWLQDP 
NTWPSPHKFDPDRFDDELVMKTFSSLGFSGTQECPELRFAYMVT 
TVLLS VLVKRLHLLSVEGQVI ETKYELVTS SREEAWITVSKRY 


5369 


1 


6622 


PRSLCFSLWAEAAVLADGGLRRRRR1.LRGTMSASFVPNGASLED 
CHCNli FCLADLTG I KW KKYVWQG PTS AP I LFPVTETEDP ILS SFS 
RCLKADVLG/VWRRDQRPERRE \L * I FWGGEDP\VLLTLFTMTY 
QKKKME CG RMDF PMN AVLC FS KAVHNLLB RCLMNRNFVR I G KWF 
VKP YEKDEKP INKSEHLSCS FTFFLHGDSNVCTS VEINQHQPVY 
LLSEEHITLAQQSNSPFQVILCPFGL.NGTLTGQAFKMSDSATKK 
LIGEWKQFYPISCCLKEMSEEKQEDMDWEDDSliAAVEVLVAGVR 
MI YPACFVLVPQSDI PTPSPVGSTHCS SSCLGVHQVPASTRDPA 
MSSVTLTPPTSPBEVQTVDPQSVQKWVKFSSVSDGFNSDSTSHH 
GGKI P RKtANHVVDR VWQ ECNMNRAQN KRKYSAS S 3G LCE E ATA 
AKVASWDFVEATQRTNCS CLRHKNLKSRNAGQQGQAPSLGQQQQ 
HjPKHKTNEKQEKSEKPQKRPLTPFHHRVSVSDDV(^D\ADS\A 

sqrlv\ isap\dsq\vrfsnir\tndvak\tpqmhgtemanspq 
ppplsp\hpcdwdegvtktpstpqsqhfyqmptpdplvpskpm 
edridslsqsfppqyqeaveptvyvgtavnlbedeaniawkyyk 
f pkkkdveflppqlpsdkfkddpvg pfgqesvts vtelmvqckk 
plkvsdblvqqyqiknqclsaiasdaeqepkidpyafvegdebf 
iifpdkkdrqnsereagkkhkvedgtssvtvlsheedamslfsps 
ikqdaprptskarppstsliydsdlavsytdldnlfnsdedelt 
pgskrsangsddkasckesktgnldplscistadlhkmyptpps 
leqkimgfs pmnmnnke ygsmdttpggtvlegnss s igaq fki e 
vdegfcs pkpsbikdfs yvykpencq 1 lvgcsmfaplktlpsqy 

LPLIKLPEECIYRQSWTVGKLELIjSSGPSMPFIKEGDGSNMDQE 
YGTAYTPQTHTSCGMPPSSAPP3NSGAGILPSPSTPRFPTPRTP 
RTPRTPRGAGGPASAQGS VKYENSDLYS PASTPSTCRPLNS VEP 
ATVPSIPEAHSLYVNLILSESVMNLFKDCNSDSCCICVCNMNIK 
GADVGVYI P DPTQEAQ YRCTCG FSAVMNRKFGNNSGLFFEDELD 
I IGRNTDCGKEAEKRFEALRATSAEHVNGGLKESEKLSDDL ILL 
LQDQ CTNLFS rFGAADQDP FPKSGVI SNWVRVEERDCCNDC YLA 
LEHGRQFMDNMSGGKVDEALVKSSCLHPWSKRNDVSMQGSQDIL 
RMLLSLQPVLQDAIQKKRTVRPWGVQGPLTWQQFHKMAGRGSYG 
TDESPEPLPIPTFLLGYDYDYLVLSP FALPYWERLMLEPYGSQR 
DIAYWLCPENEALLNGAKS FFRDLTAI YESCRLGQHRPVSRLL 
TDG I MR VGS TAS KKLS EKLVAE W FS QAAD GNNEAFSKLKLYAQV 
CRYDLGPYLASLPLDSSLLSQPNLVAPTSQSLITPPQMTNTGNA 
NT PS ATLAS AAS STMTVTSGVAI STSVATANSTLTTASTSS S S S 
SNLNSGVSSNKLP3FPPFGSMNSNAAGSMSTQANTVQSGQLGGQ 
QTSALQTAGISGBSSS LPTQPHPD VSES TMDRDKVG I PTDGDS H 
A VTYPPAIVVYIIDPFTYE2^DEST1JSSSVWTLGLLRCFLEMVQ 
TLPPHIKSTVSVQI IPCQYLLQPVKHEDREI YPQHLKS LAFSAF 
TQCRRPLPTSTKVKTLTGFGPGLAMETALRS PDRPECIRLYAPP 
FILAPVKDKQTELGETFGEAGQKYNVLFVGYCLSHDQRWrrLASC 
TDLYGELLETCI INIDVPNRARRKKS SARKFGLQKL WE WCLC5LV 
QM S S LP WR W I GRLGRI GHGELKD WS CL LS R RNLQS LS KRLKDM 
CRMCG1SAADS PS ILSACLVAMEPQGSFVIM PDSVSTGSVFGRS 
TTLNMQTSQLNTPQDTS CTHILVFPTS AS VQ VASATYTTENLDL 
AFKPNNDGADGMGIFDLLDTGDDLDP DI INI LPASPTGSPVHSP 
GSHYPHGGDAG XGQSTDRLLS TE PHEEVPN I LQQPLAXGYFVST 
AKAGPLPDWFWSACPQAQYQCPLFLKASLHLHVPSVQSDELLflS 
KKSHPIjDSNQTSDVLRPVLEQYNALSWLTCDPATQDRRSCLPIH 
FWLNQLYNFIMNMIj 


5370 


1226 


716 


RWSRKLELRRAAQATBSRPPQSQEMHPPTGKEVHALKRLRDSAN 
AraVEWO^LLEIX3ADPCAADDKGRTALHFASCNGNDQIVQIiLI» 
DHGADPNQRPGLGNTFLKIiAACTNHVPV I TTLLRGGAR VDALDR 
AG RTPLHLAKS KLN I LQEGHAQCLKAVR /HGGEADH P YAEGVSG 
APRAT*AARCSGVFPSPSRWLGSAPWSRSSCTIWSLPIiHEAKCR 
AVRPLSSAAQGSAPSSSSCCTVSTSLALAESLSLFRACTSLPVG 
GCISWL 


5371 


1331 


167 


IAAMLWKLLI^SQSCRLCSFRKMRSPPKYRPFLACFtYTTDKCS 
SKENTRTVE KLYKCS VD I RKI RR\ * KDG YF * RMKPMLKKLRI / P 
LQELGADETAVASILERCPEAIVCSPTAVNTQRKLWQLVCKNEB 
ELIKLIEQFPES FFTI KDQSNQKLNVQFFQE LGLKNWI SRLLT 
AAPNVFHNPVEKNKQMVRILQESYLDVGGSEANMKVWLLKLLSQ 
NPFILLNSPTAI KETLEFLQEQGFTS FE I LQLLSKLKGFLFQLC 
PRSIQNSISFSKNAFKCTDHDLKQIiVLKCPALLYYSVPVIiEERM 
QGIiLREGI S 1AQIRETPMVLELTPQI VQYRIRKLNSSGYRI EGJG 
HLANLNGS KXE FEANFGK I QAKKVRP LFNP VAP LNVE E 


5372 


51 


857 


SPGAQFLWAAPDMPDPLFSAVQGKDEILHKALCFCPWLGKGGME 
PLRLLILLFVTELSGAHNTTVFQGVAGQSLQVSCPYDSMKHWGR 
RKAWOIQLGEKGPCQRVVSTHNLWLLSFLRRWNGSTAITDDTLG 
GTLT I TLRNLQPHDAGLYQ CQS LHGS E ADTLRKVLVEVLAD P LD 
HRDAGDLWFPG\DLRASRM?MWSTAS?GASWKEKSPSHPLPSFS 
SW PAS FSSRF * Q PAPSGLQPGMDRS QGH I HP VNWTVAMTQG I SS 
KLCQG 


5373 


2814


346 


VKKTK3 1 FN5 AMOEMEVY VENIRRKFGVFNYSPFRTPYTPNSQY 
QMLLD p TN P S AGTAKIDKQEKVKLNFDMTAS P KI LMSKP VLSGG 
TGRRI S LSDMPRS PMSTNS S VHTGS D VEQD AE KKATS SHFSASE 
ESMD FLDKS TAS PASTKTGQAG S LS G S PKPFS PQLSAP I TTKTD 
KTSTTGSIIiNLNLDRSKAEMDLKELSESVQQQSTPVPLISPKRQ 
IRS RFQLNLDKTIESCKAQLG INE I SEDVYTAVEHSDSEDSEKS 
DS S DSEYISDDEQKS *GTSQEDTBDKEGCQMDKE PSAVKKKPKP 
TNP VE I KEELKS TSPASEKADPGAVKDKAS PE PE KDFSGKAKPS 
PHPlKDKLKGKDETDSPTVHLGIiDSDSE\NELVIDLGBDHSGRE 
GRKNKKEPKEPSPKQDVVGKTPPSTTVGSHSPPETPVLTRSSAQ 
TSAAGATATTS TSSTVTVTAPAPAATGS PVKKQRPLLPKE \ TAP 
AVQRS CGTS STVQQKEITQS PSTS T I TLVTSTQS S PLVTSSGSM 
STLVSSVNGDLPIGTASADVAADIAKYTSKL\MDAIKGTM\TEI 
YK0I^KN\TTV7KAQLAEDSQGLRIBIEKLQWLHQQEL\SBNKHN 
LELTMAEMRQS WEQERDRL I AEVKKQLELEKQQAVDETKKKQWC 
ANFKKSAI FYCCWNTSYCD YPCQ\ QAHWPEH \MKS CTQSATAPQ 
\QEADAE \ VNTETLNKSS QGSSSSTQS APSETASA\5KBKETS A 
EKSKESGSTLDLSGSRETPSSILLGSNQGSDHSR\SNKSSWSSS 
DEKRGS\TRSDHN/TPSTQHGRSLLPGKESRAGTPFLGTSK 


5374


2814 


346 


VKKTKS I FNSAMQEME VYVENIRRKFGVFN YS PFRTPYTPNSQ Y 
QMLLDPTNPSAGTAKIDKQEKVKLNFDMTASPKIL^KP VLSGG 
TGRRISLSDMPRS PMSTNS SVHTGSDVEQDAEKKATSSHFSASE 
ESMDFLDKSTASPAS TKTGQAGSLSGS PKPFSPQLS AP ITTKTD 
KTSTTGSILNLNLDRSKAEMDLKELSESVQQQSTPVPLISPKRQ 
IRSHFQLNLDKTIESCKAQLGINEISEIJVYTAVEHSDSBDSEKS 
DS S DS E YISDDEQKS *GTSQEDTEDKEGOQMDiCEPS AVKKKPKP 
TNPVEIKEELKSTSPASEKADPGAVIOKASPEPEKDPSGKAKPS 
PHP I KDKXKGKDETDS PTVHLGLDS DSE \NELVI DLGE DHS QRE 
GRKNKKEPKEP S PKQDWGKTPPSTT VGSHS P PBTP VLTRS SAQ 
TSAAGATATTSTSSTVTVTAPAPAATGSPVKKQRPLLPKE\TAP 
AVQRSCGTSSTVQQKBITQSPSTSTITLVTSTQSSPLVTSSGSM 
STLVS S VNGDLP IGTAS ADVAADI AKYTS KL\ MDAI KGTM \ TE I 
YNDLS KN\TT W KAQLAEDS QGLR I E I EKLQWLHQQElj\ SEMKHN 
LELTMAEMRQSWEQERDRLIAEVKKQLELEKQOAVDBTKKKQWC 
ANFKKEAIPYCCWNTSYCDYPCQ\QAHWPEH\MKSCTQSATAPQ 
\QEABAE\VNTETLNKSSQGSSSSTQSAPSETASA\SKEKETSA 
EKSKESGSTLDLSGSRETPSSIIiIiGSNQGSDHSR\SNKSSWSSS 
DEKRGS\TRSDHN/TPSTQHGRSIibPGKESRAGTPFLGTSK 


5375 


2907 


1116 


H I FLAEEEPMLERRCRGPLAMG PAQPRLLSGPSQESPQ.YL6KES 
RGLRQQGTSVA\QSGAQAPGRAHRCAHCRRHFPGWVA\LWLHTR 
RCQA/RGLPLPCPECGRRFRHAPPIALHRQVKAAATPDWGFACH 
LCGQS FRG WVALVLHLRAHSAAKAGP FACPKMARDAFWR RKAAS 
SSILRRCHPSRPRGPRPFICGNCGRSILPTWDQ/IiKVAHKRVHV 
SRR P * ERG P P AKVFWG PRPRGP PTG DTPPG PGGDAVDRP F\QCA 
CCGKRFRHK\PNLIRSHAACrSGERPHQ/CSRECG\KRFTWKPY 
LTS \HRRITHTARQPY PCKECGRRFRHKPNLLS HSKIHKRSEGS 
AQAAPGPGSPQLPAGPQESAAEPTPAVPLKPAQEPPPGAPPEHP 
QDP I EAPPSL YS CDDOGRS FRLERFLRAHQRQRTGERP FTCAEC 
G KN FGKKTHL VAHSR VHSG ERP FRLARKCGRRFLPRASQSGGRN 
SAEPNAPRFG PFVCPDCGKAFRHKP YLAAHRP I ATPAEKP YVCP 
DCRKAFSQKSNL\VSHRRIHTGERPYACPDCDRSFSQKSNLITH 
RKSHI RDGAFCCAI OGQTFDDEERLLAHQKKHD V 


5376 


4504 


591 


VST FS LCLWPAGGGGRGRVSNMAQS KRHVYSRTP SGSRMSAEAS 
AR P LRVG SR VE V 1 G KGH RGTVAY VGATLFATGKW VGVI LD EAKG 
KNDGTVQGRKYFTCDEGHGI F VRQSQIQVFEDGADTTS PETPDS 
SAS KVLKREGTDTTAKTS KLRGLKPKKAPTARKTTTRRPKPTRP 
ASTGVAGASSSLGPSGSASAGELSSSEPSTPAQTPLAAPI I PTP 
VLTSPGAVPPIjPSPSKEEEGLRJ^VRDLEEKLETLRLKRAEDKA 

klkelekhkiqleqvqewkskmqeqqadlqrrlkearkeakeal 
eakerymeewadtadaiematldkemaeeraeslqqevealker 
vdelttdleilkaeieekgsdgaassyqlkqleeqnarlkdalv 
rmrdlsssekqehvk\lqklmekknqelewrqqrerlqeei*sq 
aestidelkeqvljaaiigaeemve^tdri^leekvrelretvg 
dleamnemndelqenaretelelreqldmagarvreaqkrveaa 

QETVADYQQTIKKYRQLTAHIiQDVNRELTNQQEASVERQQQPPP 
E^FDFKIKFAETKAHAKAIEMELRQMEVAQANRHKSLLTAFKPD 
SFLRPGGDHDCVLVLIiIiMPRLICKAELIRKQAQEKFELSENCSE 
RPGLRGAAGEQLSFAAIGLVY\SLMPAAGHRYHRY*CHALSQCR 
LDVVYKKVGSLYPEMSAHERSLDFLIELLHKDQLDETVNVEPLT 
KAIKYYQHLYS IHLAEQPEDCTMQIiADHIKFTQSALDCMS VEVG 
RLRAFLQGGQBATD I ALLLRDLETS CS \ DIRQFCKXIRRRMPGT 
DAPGlPAALAPGPQVSDTLIiDCRKHLTWWAVLQEVAAAAAQLI 
APLAENEG L LVAALEE LAFKAS EQ I YGTPSSSP YECLRQSCNIIi 
ISTMNK\LVTAMQEGEYDAERPPSKPPP\VELRAAALRAEITDA 
EGIXSLKI^DRETVXKELKKSLKIKGEELSEANVRIiTLLEKKLDS 
AAKDADER IE KVQTRLEETQALLRKKE KEFEETMDALQAD IDQL 
EAEKAEL KQR LNSQS KRT I EGLRGP P PSGI ATL VS G I AGEEQQR 

KGAQMKASLASLPPLHVAKLSHEGPGSELPAGALYRKTSQLLET 
LNQLSTHTHWD ITRTS PAAKS PSAQLMEQVAQLKS LSDTVEKLi 
KDEVLKETVSQRPGATVPTDFATFPSSAFLRAKEEQQDDTVYMG 
KVTFS CAAGFGQRHRL VLTQE QLHQLHS RL I S 


5377 


762 


1106 


DVPCKRVLPAEAQEKGQLTLS CGESGEEG\F* YHEVRQAEGES * 

/WFGPNVRLVHTQLKTKKPSGTLKA^ 

SS * WPG YDGWWGGQ YI FIFRGMRWEBQP 


5378 


2009 


664 


QASGTTLRPLPDLPQLKRREATSRNRALKPRGRLVLMTSCLPAL
RFIATPRLSAMPHIDNDVKLDFKDVLLRPKRSTLKSRSEVDLTR
SFSFRNSKQTYSGVPIIAANMDTVGTFEMAKVLCKS*VPGSFWD
VPQMGCVFLIYKLFTLKWKMLLLSVLLPASILVAEKFSLFTAVH
KHYSLVQWQEFAGQNPDCLEHLAASSGTGSSDFEQLEQILEAIP
QVKYICLDVANGYSEHFVEFVKDVRKRFPQHTIMAGNWTGEMV
EELILSGADIIKVGIGPGSVCTTRKKTGVGYPQLSAVMECADAA
HGLKGHIISDGGCSCPGDVAKAFGAGADFVMLGGMLAGKSESGG
ELIERDGKKYKLFYGMSS*I\AM\KKYAGGVAEYRASEGKTVEV
PFKGDVEHTIRDILGGIRSTCTYVGAAKLKELSRRTTFIRVTQQ
VNPIFSEAC


5379 


2009 


664 


QASGTTLRPLPDLPQLKRREATSRNRALKPRGRLVLMTSCLPAL
RFIATPRLSAMPHIDNDVKLDFKDVLLRPKRSTLKSRSEVDLTR
SFSFRNSKQTYSGVPIIAANMDTVGTFEMAKVLCKS*VPGSFWD
VPQMGCVFLIYKLFTLKWKMLLLSVLLPASILVAEKFSLFTAVH
KHYSLVQWQEFAGQNPDCLEHLAASSGTGSSDFEQLEQILEAIP
QVKYICLDVANGYSEHFVEFVKDVRKRFPQHTIMAGNWTGEMV
EELILSGADIIKVGIGPGSVCTTRKKTGVGYPQLSAVMECADAA
HGLKGHIISDGGCSCPGDVAKAFGAGADFVMLGGMLAGHSESGG
ELIERDGKKYKLFYGMSS*I\AM\KKYAGGVAEYRASEGKTVEV
PFKGDVEHTIRDILGGIRSTCTYVGAAKLKELSRRTTFIRVTQQ
VNPIFSEAC


5380 


2 


2050 


psraggaergraaaars pggsaagwecpsvldeagactwsscvs 
sqpssnraapqdelggrgssssesqkpcealrglsslsihlgme 
s p i wtece pgcavdlguvrdrp le adgqe vpldtsgsqarphl 
sgrklslqersqgglaaggsldmngrcicpslpyspvsspqssp 
rlprrptveshhvsitgmqdcvqlnqytlkdeigkgsygwkla 
ynendntyyamkvls kkxiiirqaafprrppprgtrpapggciqp 

RGP I \ EOVYOEIAV I LKKLDHPNW\ KT.VFVTA nnPNrntir.VMv 
F\ ELVNQGPVMEVPTLKPLSEDQARFYFQDLI KGIEYLHYQKI I 
H\RDIKPSNLLVGEDGHIKIADFGVSNEFKGSDALLSNTVGTPA 
FMAPES LS ETRKI FSGKALDVWAMGVTL YCF VFG * CP FMDERIM 
CLHSKIKSQALEFPDQPDI AEDLKDLITRMLDKNPESR I WPE I 
KLH P WVTRHGAE PLP S EDENCTLVE VTEEEVENS VKHI P S LATV 
ILVKTMIRKRSFGNPFEGSRRBERSL5APGNLLTKRPTRECESL 
SELKT*KlSPIjPACCKVT*EFPHPSGCRPSCWQPPFLHTHSQPR 
*PEPPRTDEALCPYETGRTCWAPLLQVLWWVGTPLPFPLSTSWL 
PDIiVGAPGSHFCFLNIALLRYNSHTM 


5381 


2 


2050 


PSRAGGAERGRAAAARS PGGSAAGWE CPS VLDEAGACTMSS CVS 
SQPSSNRAAPQDELGGRGSSSSESQKPCEALRGLSSIiSlHXGME 
SPIVWECEPGCAVDIXSIJu^RPLEADGQBVPLDTSGSQARPHL 
SGRKLSLQERSQGGLAAGGSLDMNGRCICPSLPYSPVSSPQSSP 
RLPRRPTVESHHVSITGMQDCVQLNQYTLKDEIGKGSYGVVKLA 
YNENDNTYYAMKVLSKKKLIRQAAFPRRPPPRGTRPAPGGCIQP 
RGPl\EQVYQEIA\lLIOCLDHPNVV\KLVEVL\DDPNEDHLYMV 
F\ELVN(X3PVMEVPTLKPLSEI>QARFYFQDLIKGIEYLHYQKII 
H \ RD I KPSNLLVGEDGH I KI ADFGVSNE FKGSDALLSNTVGTPA 
FMAPESLSETRKIFSGKALDVWAMGVTLYCFVFG*CPFMDERIM 
CLHSK I KSQ ALE FPDQPD I AEDLKDL I TRMLDKNPESRI WPEI 
KLHPWVTRHGAE PLP SEDENCTLVEVTEEEVENS VKHI PS LATV 
ILVKTMIRKRS FGNPFEGSRREERSLSAPGNLLTKKPTRECBSL 
SELKT*KISPLPACCKVT*EFPHPSGCRPSCWQPPFLHTHSQPR 
* PEPPRTDEALCPYETGRTCWAPLLQVLWWVGTPIiPFPLSTSWL 
PDLVGAPGSHFCFLNIALLRYNSHTM


5382 


1536


203 


GARGS QQDAP ALQ BAB VRGPERAQ PARGRMTKARLFRLWLVLGS 
VFM ILL 1 1 VYWDS AGAAHFYLHTS FSR PHTGP PLPTPG PDRDRE 
LTADS D VDE FLDKFLSAG VKQSDLPR KETEQ P P APGSMEE S VRG 
YDWS P RD ARRS P DQGRQQ AERR S VLRG FCANSS LAFPTKER P FD 
DIPNSELSHLIVDDRHGAIYCYVPKVACTNWKRVMIVI^GSLLH 
RGAPYRDPLRI PREHVHNASAHLTFNKFWRRYGKLSRHLMKVKL 
KKYTKFLFVRDP FVRLI SAFRSKFELENEEF/ * PQVRRAHAAAV 
RQPHQPARIiGARGLPRWPQ\VSFANFIQYLLDPHT3KLAPFNEH 
WRQVYRLCHPCQIDYDFVGKLETLDEDAAQIiQIiI^VDLAAPLP 
PBLPGTGPPSSWEEDWFAKIPLAWRCXSLYJCLYgADFVLFGYPKP'" 
BNLLRD 


5383 


45 

• 


5250 


VERLLGCRNSKRTWRMLISKNMPWRRLQGISFGMYSAEEIiKKLS ' 
VKS I TN P R YLDSLGNP SANGLYDLALG PADS KEVCSTCVQDFSN 
CSGULQH1 ELPLTVYNPLLFDKL YLLLRGS CLKCHMLTCPRAVI 
HLLLCQLRVLEVGALQAVYELBRILSRFLEBNADPSASEIREEL 
BQYTTEIVQNNLLGSQGAHVKNVCESKSKLIALFWKAHMNAKRC 
PHCKTGRSVVRKEHNSKLTITFPAMVHRTAGQKDSBPLGIEEAQ 
IGKRGyLTPTSAREHLSALWKNEGFFIOTLFSGMDDDGMBSRFN 
PSVFFLDFLWPPSRSRPVSRLGDQMFTNGQTVNLQAVMKDWL 
IRKLLALMAQEQKLPEEVATPTTDEEKDSIjIAIDRSPLSTLPGQ 
SLIDKLYNIWIRLQSHVNIVFDSEMDKLMMDKYPGIRQILEKKE 
GLFRKHMMGKRVDYAARSVICPDMYINTNBIGIPMVFATKLTYP 
QPVTPWNVQELRQAVINGPNVHPGASMVINEDGSRTALSAVDMT 
Q REAVAKQ LLT PATGAP KPQGTK I VCRHVKNG D I LLLNRQ PTLH 
RP S I QAHRAR I LPEEKVLRLHYANCKAYNAJ)FDGDEWNAHFPQS 
ELGRAEAYVLACTDQQYLVPKDGQPLAGLIQDHMVSGAS MTTRG 
CFFTREHYMELVYRGLTDKVGRVKLLSPSILKPFPLWTGKQWS 
TLLINI IPEDHIPLNLSGKAKITGKAWVKETPRSVPGFNPDSMC 
BSQVI IREGBLLCGVLDKAHYGSS AYGLVHCCYE I YGGETSGKV 
LTCLARLFTAYLQLYRGFTL3VBD I LVKPKADVKRQR 1 1 BESTH 
CGPQAVRAALNLPEAAS YD3VRGKWQDAHLGKDQRDFNMI DLKF 
KEEVNHYSNEINKACMPFGLHRQFPENTLQLMVQSGAKGSTVNT 
MQ I SCI it iGQ I ELEGRS TPLMASGKSLp CFEP YEFTPRAGGF VTG 
RFLTG I KPPE FFFHCMAGREGLVDTAVKTSRSG YLQRCI I KHLE 
GLWQYDLTVRDSDGSWQFLYGEDGLDIPKTQFLQPKQFPFLA 
SNYEVIMKSQHLHEVLSRADPKKALHHFRAIKKWQSKHPNTLLR 
RGAFLS YSQKIQEAVKALKLESENRNGR/RPWDS /G/RMLRMWY 
ELDEESRRKYQKKAAACPDPSIiSVWRPDIYFASVSETFETKVDD 
YSQEWAAQTEKSYEKSELSLDRLRTLLQL\KWQRSLCEPGEAVG 
LLAAQS I G E P S TQM TLNT FHF AGRGE MNV TLG I PRLRE I LM VAS 
ANIKTPMM S VP VIjNTKKALXRVKSLKKQLTR VCLGEVLQKI DVQ 

RFFKLl^ESIKKKNNKASAFRNVNTRRATQRDLDNAGELGRSRG 
EQEGDEEEEGHIVDAEABEGDADASDAKRKEKQEEEVDYESEEB 
EEREGEENDDEDHQEERNPHREGARKTQEQDEEVGL/GH*GGPV 
PSRP PDAAPETHPQPGAPGA\ EAMERRVQAVRB I H P FI DDYQYD 
TEESLWCQVTVKLPLMKINFDMSSIiVVSLAiKIAVI YATKG I TRC 
LLNETTNUKNE KELVLNTEG INLPELFKYAEVLDLRRLYSND IH 
AI ANT YG I EAALRVI EKE I KDVFAVYG IAVDPRHLS LYAD YMCF 
EGVYKPLNR FG I RS NS S PL QQMTFETS FQFLKQATM LGS HDE LR 
S PS ACIiWGKWRGGTGLFELKQPLR 


5384 


19 S 


886 


QSCGQRLPTVL*L*GPPGSCPCILSLF\PGRPHALPEIRPYINI 
TILKGDKGDPGPMGLPGYMGREGPQGBPGPQGSKGDKGEMGSPG 
APCQKRFFA^SVGRKTALKSGEDFQTLLFERVFVNLDGCFDMAT 
GQFAAPLRGI yffslnvhs wnyketyvhi mhnqkeavi LYAQPS 

ersimqsqsvmldlaygdrvwvrlfkrqrenaiysndfdtyitf 
sghlikaedd 


5385 


326 


739 


lmvprtkkeapappkaeakakai,\kakkavlkdvkshkknkihm 
sptfrrpktl*lrrqpkypwkstprrnkldhhvtikfpijtre*a 
vxk i enns ll vftvd vkankhqi kqa vxk / lcd id va k vntl i q 

SDGERKAYVRLAPDYDALVVATKIGIT 


5386 


326 


799 


lmvprtkkeapappkaeaxakal\kakkavucdvhshkknkihm 

SPTFRRPKTL*LRRQPKYPWKSTPRRNKLDHH7IIKFPLTTE*A 

VJGCIENNSLLVFTVDVKANKHQIKQAVKK/LCDIDVAKVNTLI 

SDGERKAYVRLAPDYDALVVATKIGIT 


5387 


2 


2117 


FWAASGGCWFVLGERRAGSLLSASYGTFAMPGMVLFGRRWALA "" 
SDDLVFPGFFELWRVLWWIGILTLYLMHRGKLDCAGGALLSSY 
L I VLMI LLAWICT VS AI MCVSMRGTICNPG PR KSMS KLL Y IRL 
AI^FFPEMWASI^AAWVADGVQCDRrVVNGIIATVVVSWIIIAA 
TWS III VFDPLGGKMAP YSSAGPSHLDS HDSSQLLNGLKTAAT 
S VWETRI KLLCCCI GKDDHTRVAFS STAELFS TYFSDTDLVPSD 
IAAGIALLHQQQDNIRNNQ3PAQWCHAPGSSQEADLDAELKNC 
HHYMQ FAAAAYG WPLYI YRNPLTG LCR iGfinrrw qvropnTMT /m 

VGGDQLQL/CTSAPILHTHRAAVQGLHPRQLPWTRFTELPFLVA 
LDHRKESVWAVRGTMSLQDVLTDLSAESEVLDVECEVQDRLAH 
KG I SQAARYVYQRLINDGIIjSQAFS iapeyrlvi vghslgggaa 
ALLATMVRAAYPQVRCYAFS P PRGLWS KALQBYSQSFI VS LVLG 
KDVI P RLS VTNLE DLKRR I LRWAHCNKPKYKI LLHGLWYELFG 
GNPNNLPTELDGGDQEVLTQPLLGEQSLLTRWSPAYSPSSDSPL 
i^oo r ^ i rr u i irruKX j.rtJUU£iJitiAac>RFGCCSAAHYSAKlVSHEAE 
FS K I L I GPKMLTDHMPD I LMRALDS WSDRAACVS CPAQG VS S V 
DVA 


5388 


1569 


753 


tAD^AGC^GRRQAGVRRHYLYPFTGGtfRRRRAACQAERPAARS 
KDTDLAAYQKGNLGVQLRNMAQETNHSQVPMLCSTGCGPYGKPR 
TNGMCSVCYKEHLQRQNSSNGRISPPVQCTDGSVPEAQSALDST 
SSSMQPSPVSNQSLLSESVASSQLDSTSVDKAVPETEDVQASVS 

w * n «x* ^ciCiV" AOJjB \HilH AAKXA V oLftt?K KnlJJjijGIjNAGVEMr 

TWYTVTQMYTIALTITKQMLKNFVFQQEFKSFGSFHQQLLEYK 
ILEHLQTKN 


5389


1569 


753 


rADGGAGGGGRRQAGVRRtiyLVpPT < GGYRRRRAAC:QAERPAXfeS " 

KDTDLAAYQKGNIiGVQLRNMAQETNHSQVPMLCSTGCGFYGNPR 

TNGMCSVCYKEHI^RO^SSNGRISPPVQCTTJGSVPEAQSAIiDST 

SSSMQPSPVSNQSLLSESVASSQLDSTSVDKAVPETEDVQASV3 

DTAQQPS EEQ 3 KS LE \NRNKKR IAVS CAGRKWD LLGLNAG VEM F 

TWYTVTQMYTIAI.TITKQMLKNFVFQQEFKSFGSFHQQLLEYK 


5390 


217 


1332 


KUPRKI^EDKMWSECEGPEMSLVCX^DFQAHAREQLSKSTRDFI 
EGGADDSITRDDNIAAPKRIRLRPRYLRDVSBVDTRTTIQGEEI 
SAP I C I APTGPHCLVWPDGEMSTARAAQAA\G I C Y I TSTFAS CS 
LEDIVIAAPEGLRWJ^LYVHPDI^IiNKQLlQRVESLGPkALVIT 
LDTPVCGNRRHDIRNQLRRKI»TIiTDLQSPKKGNAIPYFX3MTPIS 
TSLC WNDLSWFQS I TRLP I ILKGILTKEDAEIAVKHNVQGI I VS 
NHGGRQLDBVLAS I DALTE WAAVKGKI EVYLDGGVRTGNDVLK 
ALALGAKCIFLGDAI LWAtAS KGEHGVKE VLNI LTNEFHTSMA\ 
LTGCRSVAE INRNLVQPSRIi 




1 


1292 


fiwirtUADfturf i n<*{^Kv^tu&j\£xi I V WiiKKijOVKAWVKEMRGS c 

QPPVCNKI^HQEQLKVMPVGGPNTRKDYHIEEGEEVFYQLEGDM 
VLRVLEQGKHRDWIRQGE I FLL PAR VPHS PQR FANTVGLWER 
R RLETELDGLR YYVGDTMD VL FEKW FYCKDLGTQ LAP I IQEFFS 
SEQYRTGXPIPDQLLKEPPFPIiSTRSIMEPMSLDAWLDSHHREL 
QAGTPLSLFGDTYETQVIAYGQGSSEGLRQNVDWLWQLEGSSV 
VTMGGRRLSLGPWMDSLLVLS WGPS Y \AW\ERTQGS VALSVT\Q 
DPACKKSPWGEPSCHGLKAATGVPSTI,EVPSLPNNSPSPHYI,SV 
YCRCVPHRPAHCCHPPSCPSQPRCHAPGRAAAPHLLWQTQPTAL 
PVLPGGLPPAPLLPI PLSLQTQCSTSTPRRPSIKAS 


5392 


1 


1623 


IRGSNAQKVVGASGSGGAG PQPDPAG PGGVPALAAAVIiGACE PR 
CAAPCPLPALSRCRGAGSRGSRGGRGAAGSGDAAAAAEWIRKGS 
F X HKPAHG WLHPDARVLGPGVS YWR YMGCIEVLRSMRS LDFNT 
•k iv v Axttt/ixivitijric.iivFw VK(»i WKKKAPNKALAS VLGKSNLR FA 
GMSISIHISTOGLSLSVPATRQVIANHHMPSISFASGGDTDMTD 
YVAYVAKDPINQRACHI LECCEGL\AQS I ISTVGQAFELRFKQY 
LHSPPKVALPPERLAGPEESAWGDEEDSLEHNYYNSIPGKEPPL 
C^LTOSRLALTQPCALTAIiDQG PS PSLRDACSLP WDVGSTGTAP 
PGDGYVQADARGPPDHEEHLYVOTOGLDAPEPEDSPKKDLFDMR 
PFEDALKLHECS VAAGVTAAPLPLEDQWPS PPTRRAPVAPTEEQ 
LRQEPWYHGRMSRPJU\ERMLRADGDFLVRDS\miPGQYVLTGMH 
AGQPKHLLLVDPEGWRTKDVLFES IS HL I DHHLQNGQ P I VAAE 
SELHLRGWSREP 


5393 


2 


982 


GGDSAGMTMETQMSQNVCPRNLWLLQPLTVLLLLASADSQAAAP
PKAVLKLEPPWINVLQ\EDSVTLTCQGAPQP/ERSDSIQWFHNG
\NLIPTHTQPS\YRFKANNN\DSGEYTCQTGQTSL\SDPVHLTV
LSEWLVLQTPHLEFQEGETIMLRCHS\WRDKP\LVKVTFFQNGK
SQKFSHLDPTFSIPQANHSHSGDYHCTGNIGYTLFSSKPVTITV
QVPSMGSSSPMGIIVAWIATAVAAIVAAWALIYCRKKRISAN
STDPVKAAQFEPPGRQMIAIRKRQLEETNNDYETADGGYMTLNP
RAPTDDDKNIYLTLPPNDHVNSNN


5394 


2 


982 


GGDSAGMTMETQMSQNVCPRNLWLLQPLTVLLLLASADSQAAAP
PKAVLKLEPPWINVLQ\EDSVTLTCQGAPQP/ERSDSIQWFHNG
\NLIPTHTQPS\YRFKANNN\DSGEYTCQTGQTSL\SDPVHLTV
LSEWLVLQTPHLEFQEGETIMLRCHS\WRDKP\LVKVTFFQNGK
SQKFSHLDPTFSIPQANHSHSGDYHCTGNIGYTLFSSKPVTITV
QVPSMGSSSPMGIIVAWIATAVAAIVAAWALIYCRKKRISAN
STDPVKAAQFEPPGRQMIAIRKRQLEETNNDYETADGGYMTLNP
RAPTDDDKNIYLTLPPNDHVNSNN


5395 


3135 


531 


RASDAKNQEGLLNTRRKSTDSVPISKSTLSRSLSLQASDFDGAS " 

SSGNPEAVAl^PDAYSTGSSSASSTLKRTKKPRPPSIiKKKQTTK 

KPTETPPVKETQQEPDEBSLVPSGEKLjASETKTESAKTEGPSPA 

LLBETPLEPAAGPKAACPLDSESVEGWPPASGGGRVQNSPPVG 

RKTLPLTTAPEAGEVTPSDSGGQEDSPAKGHSVRjEFDYSEDKS 

SWDNQQENPPPTKK1GKKPVAKMPLRRPKMKKTPEKLDNTPASP 

PRSPAEPNDIPIAKGTYTFDIDKWDDPNFNPF5STSKMQBSPKL 

PQQSYKFDPDTCDESVDPFKTSSKTPSSPSKSPASFEIPASAME 

ANGVDGDGLNKPAKKKKTPLXTDT FRVKKS PKRS PLSDP PSQDP 

TPAATPETPPVISAVVBJ\TDEEKLAVTNQKWTCMTVDLEADKQD 

YPQPSDLSTFVNETKFSSPTEELDYRNSYEIBYMEKIGSSLPQD 

DDAPKKQALYLMFDTSQES P V KSS P VRMSESPTPCSGS S FEE TE 

ALVNTAAKNQHPVPRGLAPNQESHLQVPEKSSQKELEAMGLGTP 

SEAIEI TAPEGSFASADALLSRLAHPVSLOGALDYIVEPDLAEKN 

PPLFAQKLQREAAHPTDVS IS KTAL YSR IGTAEVEKPAGLLFQQ 

PDLDSALQIARAEI ITKEREVSEWXDKYEESRREVMEMRKIVAE 

YEKTIAQMIEDEQREKSVS\HQTVQQLVIiEKEQA\l*ADLNSVEK 

\SLADLFRRYBKMKEVLEGFRKNEEVLKRCAQEYLSRVKKEEQR 

YQALKVHA\EEKIJDRANAE\IAQVRGKAQQEQAAHQASIAERSS 

CRV\DALERTLEQKNKEIBELTKICDELIAKMGKS 


5396 


3135 


531 


RASDAKNQEGLLKTRRKSTDSVPi^KiTi^RSLSLQASDFDGA^ 
S SGN PE AVALAPDAYS TGS S S AS STLKRTKKPRPPSLKKKQTTK 
KPTETP PVKETQQEPDEESLVPS GENLASETKTBSAKTEGPSPA 
LLEETP LE PAAG PKAACP LDS ES VEG WP PAS GGGR VQNS P P VG 
RKTLPLTTAPEAGEVTPSDSGGQEDSPAKGHSVRLEFDYSEDK5 
S WDNQQENPPPTKKXGKKPVAKM PLRRP KMKKTPEKLDNTPAS ? 
PRSPAEPNDIPIAKSTYTFDiDKWDDPNFNPFSSTSKMQESPKL 
PQQS YNFDPDT CDESVDPFKTS S KTPSSPS KSPASFE I PASAME 
ANGVDGDGLNKPAKKKKTPLKTDTFRVKKSPKRSPLSDPPSQD? 
TPAATPETPPV ISAWHATDEE KLAVTNQKWTCMTVDLEADKQD 
x r auiia Lc vac. 1 1U? S>or I tiOUUiRNSxSXtJxnEKZQSSJjPQD 
DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE 
ALVNTAAKNQH PVPRGLAPNQE SHLQVPE KS SQ KELE AMGLGTP 
SEAIEITAPEGSFASADAIJjSRLAHPVSLCGALDYLEPDIiAEKN 
PPLFAQKLQREAAHPTDVSISKTALYSR I GTAE VEKPAGLLFQQ 
PDLDS ALQ IARAE I ITKEREVS EWKDXYEESRREVMEMRKI VAE 
YEKTIAQMIEDEQREKSVS\HQTVQQI"VXiEKEQA\LADLNSVEK 
\SLADLFRRYEKMKEVLEGFRiCNEEVLKRCAQEYIiSRVKKEEQR 
YQALKVHA\EBKLDRAKAE\ I AQVRGKAQQEQAAHQASLAERSS 
CRV\DALERTLEQKNKEIEELTKI CDEL IAKMGKS 


5397


3135


531 


RASDAKNQEGLLNTRRKSTDSVPISKSTLSRSLSLQASDFDGAS
SSGNPEAVALAPDAYSTGSSSASSTLKRTKKPRPPSLKKKQTTK 
KPTETP PVKETQQEPDEESLVPS GBNLASETKTESAKTEGPSPA 
LL E ET PLE PAAG PKAAC P LDS E S VEG WP P AS GGGRVQNS PP VG 
RKTLPLTTAPEAGEVTPSDSGGQEDSPAKGH3VRLEFDYSEDKS 
SWDNQQENPPPTKKIGKKPVAKMPLRRPKMKKTPEKLDNTPASP 
PRS PAEPND I PIAKGTYTFD I DKWDDPNFNP FSSTSKMQESP KL 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid
residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid
residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
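
As an illustration of how the legend above can be read, the following minimal Python sketch separates an annotated segment into plain residues and the special symbols the legend defines. The function name `summarize_segment` and the returned dictionary layout are hypothetical and are not part of the specification. Stripping whitespace and upper-casing the input are assumptions made only to cope with scanning artifacts. The example string is the last line of the segment listed for SEQ ID NO: 5399.

```python
from collections import Counter

# One-letter residue codes listed in the table legend.
RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")

# Special symbols listed in the same legend.
MARKERS = {
    "X": "unknown residue",
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize_segment(segment: str) -> dict:
    """Split an annotated segment into residues, legend markers, and other characters.

    Hypothetical helper: whitespace removal and upper-casing are assumptions,
    not rules stated in the table.
    """
    compact = "".join(segment.split()).upper()
    residues = "".join(c for c in compact if c in RESIDUES)
    marker_counts = Counter(c for c in compact if c in MARKERS)
    unrecognized = sorted(set(compact) - RESIDUES - set(MARKERS))
    return {
        "residues": residues,
        "markers": {MARKERS[c]: n for c, n in marker_counts.items()},
        "unrecognized": unrecognized,
    }


if __name__ == "__main__":
    # Raw string so the backslash marker is kept literally.
    example = r"\ADIEPNGKVKYDEFIHKITSYLDGTY"
    print(summarize_segment(example))
```

Characters reported under `unrecognized` are not defined by the legend and may indicate transcription damage rather than sequence content.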








PQQSYNFDPDTCDKSVDP^kWiK^PSSPSi^PAS^IPASAME 
ANGVTCIXSLNKPAKKKKTPLKTDTPRVKKSPKRSPIjSDPPSQDP 
TPAAT PETPPVI S AWHATDEE KLAVTNQKWTCMTVDLEADKQD 
VPQPSDLSTFVNETKFSSPTEELDYRNSYBIBYMEKrGSSLPQD 
DDAP KKQALYLMFDTS QES PVKS SP VRMS ESPT PCSGSS FEE TE 
ALVNTAAKNQHPVPRGLAPNQESHLQVPEKSSQKELSAMGLGTP 
SEAIBITAPEX3SFASADAI»LSRLAHPVSX*CGALDYLEPDLAEKN 
P PLFAQKLQREAAHPTDVS I S KTALYSR IGTAEVE KPAGLL FQQ 
PDLDSALQ I ARAE I ITKEREVSEWKDKYEESRRBVMEMRKIVAE 
Y EKT I AQM I EDEQREKSVS \HQT VQQLVLEKEQA\ tiADUJS VBK 
\ S LADLFRRYE KMKEVLEG FRKNEE VLKRCAQE YLSRVKKEEQR 
YQALKVHA\EEKLDRANAE \ IAQVRGKAQQEQAAHQASXAERSS 
CRV\DALERTLEQKNKEIEELTKICDELIAKMGKS 


5398 


56 


5426 


SGEVCRMESNPMQEGVpRPSYVFSADPIARPSBlNFt)GIKLDLS 

HEFSLVAPNTEANSFESKDYLQVCLRIRPFTQSEKELESEGCVH 

ILDSQTWLKEPQCILGRLSEKSSG\QM\AQKFS FFPGFLGPAT 

TQKEFFQGClMHP\VXDLLKGQSRIiIFTYGLTNSGKTYTFQGTB 

ENlRILPRTLNVLFDSljQERLYTKMNIiKPHRSRBYLRLSSEQEK 

EEI ASKSALLRQI KE VTVHNDSDDTLYGS LTNSLNI SEFEES I K 

DYEQANLNMANS IKFSVWVSFFEI YNEYI YDLFVPVSSKFQKRK 

MLRLS QDVKG YS F I KDLQWI QVSDS KBAYRLLKLG I KHQS VAFT 

KLNNASSRSHSIPTVKILQIEDSEMSRVIRVSELSLCDIAGSBR 

TMKIX}NEGERLRETGNINTSUiTLGKCINVI,KNSEXSKFQQHVP 

FRESKLTHYF/QSFFNGKGKICMIVNISQCYIAYDETLNVLKFS 

AIAQKVCVPDTLNSSQEKLFGPVKSSQDVSLDSNSNSKILNVKR 

ATISWENS LEDLMEDEDLVEELENAEETED /VGETKLLDEDLDK 

TLEENKAFrSHEEKRKLI«DLI EDLKKKLINBKKEKIiTLEFKI RE 

EVTQEFTQYWAQREADFKETLLQEREI LEENAERRbAI FKDLVG 

KCDTREEAAKDICATKVETEEATACLELKFNQIKAELAKTKGEL 

IKTKEELKKRENESDSLIQELETSNKKIITQNQRIKELINIIDQ 

KEDTINEFQNIiKSHMENTFKCNDKADTSSIjIINNKLICNETVEV 

PKDSKSKI CS3RKRVNENELQQDEPPAKXGS IHVSSAITEDQKK 

SEEVRPNIAJSrEDIRVLQENNEGLRAFLLTIE^EIiKNEKEEKAE 

LNKQIVHFQQSLSLSEKKNLTLSKEVQQIQ3NYDIAIAELHVQK 

S KNQEQEE KIMKLSNE I ETATRS ITNNVSQ IKLMHTK1 DELRTL 

DS VSQISN I DLLNLRDLSNGS EEDNLPNTQ LDLLGND YLVSKQ V 

KEYRIQEPNRENSFHSSIEAIWEECKEIVKASSKKSHQIEELEQ 

QI EKLQAEVKGYKDEK1JRLKEKEHKNQDDX»1jKEKETL iqqlkbe 

LQEKNVTLDVQIQHVVEGKRALSELTQGVTCYKAKIKELETILE 

tqkvershsakleqdilekesiilklernlkbfqbhlqdsvknt 

KDLNVKELKLKEEITQLTNNLQDMKHLLQLKEEEEETNRQETEK 
LKEEtjSASSAR7X}N\XJflAI)LQRKEEDYADLKEKLTDAKKQXKQV 
QKE VS VMRDEDKLLRI KlNELEKKicNQCSQELDMKQR\ TIQQL K 
EQL INQKVEEAI QQYERACKDLNVKEKI IEDMRMaLEEQEQTQV 
EQDQVL \ EAKLSEVERLATELDR WR VXCNDLETKNNQRSNKEHB 
NKTDVLGKLTNLQDELQESEQKYNADRKKWLEEKMMLITOAKEA 
E N I RNKEMKKYAEDRB RFF KQQNE ME I LTAQLTE KD S DLQ KWRE 
ERDQIiVAALE IQLKAL I S SNVQKDNE I EQL KRI I SETSKI ETQ1 
MD I KP KRI S SADPDKLQTE PLSTS FE I SRNKI EDGS WLDS CEV 
STENDQSTRFPKPELEIQFTPMPNKMAVKHPGCTTPVTVKIPK 
ARKRKSNEMEEDL VKCENKKNATPRTNLKFP I S DDRNSS VJCKEQ 
KVAIR PSS KKT YSLRSQAS 1 IGVNLATKKKEGTLQKFGD FLQHS 
PSII^SKAKKIIETMSSSKLSNVEASKENVSQPKRAKRKLYrSE 
ISSPIDISGQVILMDQKMKESDHQI IKRRLRTKTAK 


5399 


705 


230 


GPRMAXFLSQDQINEYKECFSLYDKQQRGKIKATDU1VAMRCLG 
ASPTPGEVQRHLQTHGIDGNGEIiDFSTFLTIMHMQIKQEDPKKE 
lLIiAh3LMVDKEKKGYVMASDLRSKLTSLGEKLTHKEV\DDLFRE 
\ADIEPNGKVKYDEFIHKITSYLDGTY 


5400 


931 


248


sMcssGMeipptnypasraai.vaqnyinyqqgtphrvfevqkvk 

QASMBDI PGRGHKYRLKFAVEBI IQKQVKVNCTASVLYPSTGQE 
TAPEVNFTFBGETGKNPDBEDOTFYQRLKSMKBPLEAQNI\PDN 
FGNVSPEMTLVLHLAWVACGYI IWQNSTEDTWYKMVKIQTVkQV 
QRNDDFI ELiDYTI LLHNIASQB 1 1 PWQMQ VLWHPQY GTKVKHNS 
RLPKEVQLE 


5401 


3 


1360 


TGWSYGPTTSLAFIiAPRDFPFPPKLIjIHPQAVVRLSCGAGSMGS 

kwaprqddmlfyvrrklaysgsesgadgrkaaepevevevyrrd 
skklpglgdpdidweesvclnlilqkldymvtcavctradggdi 
hihkkksqqvfaspskhpmdskgeeskisypniffmidi5f\ee\ 
vfsdmtvgkgemvcvel vasdktntpqgvi fqgs i r ybalkkvy 
dnrvsvaarmaqk\msfgfskysnmef\vr\mkgpqgkghaema 
vsrvstgdts pcgteedsspas pmhervtsfstpptpernnrpa 
ffspslkrkvprnriaemkkshsandseeffreddggadlhkat 
nlrsrslsgtgrslvgswlklnradgnfllyahltyvtlplhri 
ltd i le vrqkp ilmt 


5402 


3445 


1563 


GECFIMAAWQQNDLVFEFASNVMEDERQLGDPAIFPAVIVEKV 
PGADILNS YAGLACVEEPNDMITESSLDVAEEE I IDDDDDDITL 
TVEASCHDGDBTI ET I EAAEALLNMDSPGPMLDEKRINNNIFSS 
PEDDMVVAP VTHVS VTLDGI PEVMETQQVQEKYAES PGASSPBQ 
PKRKKGRKTKPPRPDS PATTPN I S VKKKNKDGKGNT I YLWBFZiL 
ALLQDKATCPKYIKWTQREKGI FKLVDS KPVSRLWRKHKNKP\D 
MNYEPMGRALRYYYQRGILAKVEGQRLVYQFKEMPKDL1YINDE 
DPSSSIESSDPSLSSSATSNRNQTSRSRVSSSPGVKGGATTVLK 
PGNS KAAKPKDPVEVAQPSEVLRTVQPTQSPYPTQLFRTVHWQ 
PVQAVPEGEAARTS TMQ DETLNS S VQS IR \ T IQAPTQVPVWS P 
RNQQ\ LHTVTLQTVPLTTVIAS TDPSAGTGSQKFI LQAI PSSQP 
MTVLKENVMLQSQKAGS PPSIVLG PARV\QQVI»TS NVQTI CNGT 
VSV\ASSPSFS\ATAPWTI*FLLGSSQliVAHPPGTVITSVIKTQ 
ETKTLTQE VEKKESEDHLKENT EKTEQQPQ P YVMVVS S SNGFTS 
QVAMKQNELLEPNSF 


5403 


3445 


1563 


GECFI MAAWQQNDLVFEFASNVMEDERQLGDPAI FPAVI VEHV 
PGADI LNS YAGLACVE E PNDM I TE5 SLDVAEEE I IDDDDDDITL 
TVEAS C3flX3DETIETIBAAEAIiNMDSPGPMLDBKRINNNI FSS 
PEDDM WAPVTHVS VTLDG I PEVMETQQ VQEKYAD SPGAS S P EQ 
PKRKKGRKTKPPRPDSPATTPNISVKKKNKDGKGNTIYLWEFLL 
ALLODKATCPKYIKWTnPPKflTPirrtVntlK'OVCPT W3 xrvtwnvT3\ n 

MNYE PMGRALR YYYQRG I LAKVEGQRIiVYQ FXEMPXDLI YINDB 
DPSS S I ESSDPSLSSSATS NRNQTSRSRVSSSPGVKGGATT VLK 
PGNSKAAKPKDPVEVAQPSEVLRTVQPTQSPYPTQLFRTVHWQ 
PVQAVPEGEJ^TSTMQDETLNSSVQSIR\TIQAPTQVPVVVSP 
RKQQ\LHTVTLQTVPLTTVIASTDPSAGTGSQKFILQAIPSSQP 
MTVLKENVMLQ SQ KAGS P P S I VLGP AR V\ QQVLTSNVQT I CNGT 
VSV\ ASSPSFS \ATAPWTLFLIiGSSQLVAHPPGTVI TSVT KTQ 
ETKTLTQEVEKKESEDHLKENTEKTEQQPQPYVMWSS SNGFTS 
QVAMKQNELLEPNS F 


5404 


187 


1111 


LPVTLIFAKMKTLQSTLLLLLLVPLIKPAPPTQQDSRHYDYGT 
DNFEES I FS QDYED KYLDGRN I KBKETVI I PNEKSLQLQKDE AI 
TPLPPKKENDEMPTCLLCVCLSGSVYCEEVDIDAVPPLPKESAY 
L YAR FNKI KKL T \ AKDFADI PNLRRLDFTGNL IED I EDGTFS Kit 
SLVEELSLAENQLLKL P VL PPK LTIiFNAKYWKI KSRGZ XANAFK 

KLNNI»TFLYTjDHNA1iES VPL»NTjPE QTjPVTHT /ITS'NTJTa Q TTHHTP 

CKANDTSYIRDRIEEIRLEGNPIVLGKHPNSFICLKRLPIGSYF 


5405 


2199


1220 


QNS RSLHMDPQNQHGSGSSL W I QQPSLDSRPRLD YERE IQ PTA 
ILSLDQIKAIRGSNEYTEGPSWKRPAPRTAPRQEKHERTHEII 
PINVNNNYEHRHTSHLGHAVLPSNARGPILSRSTSTGSAASSGS 
NSSASSEQGLLGRSPPTRPVPGHRSERAIRTQPKQLIVDDLKGS 
LKEDLTQH KF I CEQCGKCKCGECTAPRTLPS CLACNRQ CLCSAE 
SMVEYGTCMCLWKGIFYHCSNDDEGDSYSDNPCSCSQSHCCSR 
YLCMGAMSLFLPCLLCYPPAKGCLKLCRRCYDWIHRPGCRCKNS 
NTVYCKLBSCPSRGQGKPS 


5406 


279


2732 


RWRTYNVEGPLTFMDVAIEFCLEEWQCLDTAQQNLYRNVMIiENY 
RNI»VFLG/ 1 IAVSKPDLITCLEQEKEPWEPMRRHEWVAKPPVMC 
SHFTQDFWPEQHIKDPFQKATLRRYKNCEHKNVHLKKD^KSVDE 
CKVHRGGYNGFNQCLPATQSKI PLFDKCVKAFHKFSNSNRHKI S 
HTEKKLFKCKECGKSFCMLSHLAQHKIIHTRVNFCKCEKCGKAF 
NCPS I ITKHKRINTGEKPYTCEBCGKVFNWSSRLTTHKKNYTRY 
KLYKCEECGKAFNKSS ILTTHKI IRTGEKFYKCKECAKAFNQSS 
NI.TEHFOCIHPGEKPYKCEECGBCAFNWPSTLTKHIGaHTGEKPYT 
CEECG KAFNQF SNLTTHKR IHTA\EKPYKCTEOGEAFSRS \ S NL 
i j.h l b JUiPXKUsBOGKAFKWSSKLTEHKLTHTQEKPYKCE 
KCGKAFNCPS I ITKHNRINTGE KPYTCEECGKVFNWSSRLTTHK 
KNYTRYKLYKCEECGKAFNKSSILTTHKKIHIEKXFYKCEECGK 
AFKWS S KLTEHK I THTGEK? YKCE EGGKAFNHFS I LTKHKR I HT 
GEKPYKCEECGKAFTQSSNLTTHKKIH'fGEKFYKCEEOGKAFXQ 
" WH1J * * nAJ *Jn j. uuAri Aujtse.ctjAArWyro XHKAIHTEEKP 
YKCEE CGKAFKWS STLTKHKI I HTGEXP YXCEECG\ KAFKLS ST 
LSTHKI IHTGEKP YKCEKCGKAFNRPSNLIEHKKIHTGBQPYKC 
EECGKAFN YS SHLNTHKR IHTKEQP YKCKECGKAFNQYSNLTTH 
NKIHTGEKLYKPEDVTVI LTTPQTFS NI K 


5407 


3 


659


RPRRRQS SCCTG WLAGWLLRAAPR FCRRTETDMEQGKGLAVL 1 1> " 
AIILLQGTLAQSIKGNHLVKVYDYQEDGSVLLTCDAEAKNITWF 
KDGKM IGFLTEDKKKWNLGSNAKDPRGM YQCKGSQNKSKPLQVY 
YRMCQNCIELNAATISGFLFAE1 VS I FDLAVGVYFIAGTGMEFR 
QS \RASDKQTLLP \NDPAPTQPLKDPRKMTQYSHLQGN\QLRRN 


5408


2745 


6128 


QGSKGTCHPQAQQPWDEGVWQEAPSQSEPWGQSQEPPTMPQR^P 
HARQHTPLPLGSADYRRWSVRPQGPHRDPKDSRDAAKREQGSti 
APRPVPASRGGKTLCKGYRQAPPGPPAQFQRPICSASPPWASRF 
STPCPGGAVREDTYPVGTQGVPSLAIAQGGPQGSWRFLEWKSMP 
RLPTDLDIGGPWFPHYDFERSCWVRAISQEDQLATCWQAEHCGB 
VRNKDMS WPEEMS FI ANSS KIDRHKVPTEKGATGLS NLGNTCFM 
NSSI QCVSNTQPLTQYFISGRHLYELNRTMPIGMKGHMAKCYGD 
LVQELWSGTQKNVAPLKLRWTIAKYAPRFNGPQQQDSQELLAFL 
LDGLHEDLNRVHEKPYVELKDSDGRPDWEVAAEAVJDNHLRRNRS 
IWDLFHGQLRSQVKCKTCGHISVRFDPFNFJjSLPLPMDSYMHL 
EITVIKLDGTTPVRYGIiRLNMDEKYTGLKKQLSDLCGLNSBQIL 

laevhgsniknfpqdnqkvrls vsgflcafei pvpvspisass p 
tqtdfssspstnemftlttngdlprpifipngmpntwpcgtex 

NFTNGMVNGHMPSLPDSPFTGYIIAVHRKMMRTELYFIiSSQKNR 

pslfgmpli vpctvhtrkkdlyeavw iqvsrlas plppqeasnh 
aqdcddsmgyqypftlrwqkdgnscawcpwyrfcrgckidcge 

DRAFIGNAYIAVDWHPTAT.pt iP VnTCnPUTnmpuvcTnp/i e?r>r» » o, 

vh p inld s clrafts ee elg enem yycs kc kthclat kki»dlwr 
lppiliihlkrfqfvngrwiksqkivkfpresfdpsaflvprdp 
alcqhkpiitpqgdelsepritjarevkkvdaqssageedvllsks ' 
psslsaniisspkgspsssrksgtscpssknsspnssprtlgrs 
kgrlrl p q i gs knkls s ske nldas kengagq i celadalsrgh 
vxxx^qpelvtpqdhbvalangflyeheacgngcgngysngqiig 

NHSEEDSTDDQREDTRIKPIYNLYAISCHSGIIjGGGHYVTYAKN 
PNCKWYCYNDS SCKE LHPDE IDTDSA Y I LFYBQQGI DYAQFLPK 
TDGKKMADTSSMDBDFESDY\EKYCVLQ 


5409 


2745 


6128 


qgskgtchpqaqqpwdegv«qeApsOsbpwgqsqepptmpqrlp~~ 

HARQHTPLPLGSADYRRWSVR PQG PHRD PKDS RDAAKREQGS L 
APRPVPASRGGKTLCKGYRQAPPGPPAQFQRPICSASPPWASRF 
STPCPGGAVREDTYPVGTQGVPSUaAQGGPQGSWRFLEWKSMP 
RLPTDLDIGGPWFPHYDFERS CWVRAISQEDQLATCWQAEHCGE 
VRNKDMSWPEEKSFIANSSKIDRHKVPTBKGATGLSNLGNTCFM 
NSS IQCVSNTQPLTQYFISGRHLYELNRTNP 1GMKGHMAKCYGD 
LVQELWSGTQXNVAPLKLR WT I AKYAP RFNG FQQQDS QELLAF L 
LDGIil^DLNRVHEKP YVE LKDSDGRPDWEVAAEAWDNHLRRNRS 
IWDLFHGQLRSQVKCKTQGHI8VRFDPFNFLSLPLPMDSYMHL 
EITVIKLDGTTPVRYGLRLNMDEKYTGLKKQLSDLCGLN3BQIL 
IAEVHGSNlKNFPQDNQKVRjOSVSGFIiCAFEIPVPVSPISASSP 
TQTDFSSSPSTNEMFTLTTNGDLPRPIFIPNGMPNTWPCGTEK 
NFTNGMVNGHM PS LPDS P PTG YI I AVHRKMMRTEIiYFLS SQKNR 
PSLFGMPLIVPCTVHTRKKDXiYDAVWIQVSRLASPLPPQBASNH 
AQDCDD SMG YQYP FTLRWQKDGNS CAWCP WYRFCRGCKIDCGE 
j-tu-lt mnAi xtxYunriir lALiriLiKxyi oy&KvvDjSHESVEQSRRAQ 
VEPIWLDSCLRAFTSEEELGENEMYYCSKCKTHCLATKKLDLWR 
LPP I L I IHLKRFQFVNGRW I KSQXI VKFPRES FDPSAFLVPRDP 
ALCQHKPLTPQGDELSEPRI LAREVKKVDAQSSAGEEDVLLSKS 
PSSLSANIISSPKGSPSSSRKSGTSCPSSKNSSPNSSPRTLGRS 
KGRLRLPQIGSKNKLSSSKENLDASKENGAGQICELAJDALSRGH 
vuuvoy ftbv x Fyutib ViyjANGFIiYBaEiACGNGCGNGYSNGQlJG 
NHS B EDS TDDQREDTR I KP I YNL YAI S CHSG I LGGGHY VT YARN 
PNCKWYCYNDSSCKELHPDE I DTDSAYILFYBQQGIDYAQFLPK 
TDGKKMADTSSMDEDFESDY\EKYCVLQ 


5410 


2 


710 


LSFPGQARHWLAARMQAPHKEHL YKLL VIGDLGVGKTS 1 1 KRY 
VHQNFSSHYRATIGVDFALKVLHWDPETWRLQLWDIAGQERFG 
NMTRVYYREAMGAPIVFDVTRPATFEAVAKWKNDLDSKLSLPNG 
KPVSVVLLANKCDQGKDVLMNNGLKMDQFCKEHGFVGWFETSAK 
ENINIDEASRCLVKHILANECDLMESIEPDWKPHLTSTKVASC 
SG\CAKI LVGTFAGVW 


5411 


1302 
1 3180 


289 


TGPAAAGRRKALGS FGKPS P VTGLRAARRRRTRPSAPAAPS VGC
G KRRESDAGAGGERAS VRTG SGRRGGRTMAGDSEQTtiQNHQQ ?N 
GGEP FL I G VSGGTASGKS S VCAK I VQLLGQNE VDYRQKQ WI LS 
QDSFYRVLTSEQKAKALKGQFNFDHPDAFDNELILKTLKEITEG 
KTVQI PVYDFVSHSRKEETVTVYPAD VVLFKGILAFYSQBR/ 1 R 
DLFQMKLFVDTDADTRLSRRVLKDISERGRDLEQILSSSTLRFV 
KPA\ FEE FCIiPPK\KYADVI I PR\GADN\RVPINL I VQHIQ\D I 
IiNGGPS \NRQTNGCLNGYTPSRKRQASESSSRPH 


5412 




313 


QGISNFFHKEANFWFE VSG YL I SPLRS PFVDPALE WSLMAS PWN 
KMEGESSRFEIHTPVSDKKKKKCSIHKERPQKHSHE I FRDSSLV 
NEQSQ I TRRKKRKKDFQHIi I S S PLKKSR I CDETANATSTLKKRK 
KRRYSALEVDEBAGVTVVLVDKENINNTPKHFRKDVDWCVXIMS 
IBQKLPRK\ PKTDKFQVLAJC5H\AHKSEAJLHSITVREKKNKICHQR 
KAASWESQRA\RDTLPOSBFPTQEESWLSVGPGGEITELP\ASA 
HKNKSKKKKKKSSNREYET\LAMPEGSQAGREAGTDMQESQPTV 
GLDDETPQLLGPTHKKiCSKKKKKKKSNHQEFBSIjtAMPEGSQVGS 
EVGADMQES \RPAVGLHGETAGI PAPAYKNKSKKKKKKSNHQEF 
EAVAMPES LESAYPEGSQVGS E VGTVEGS TALKGFKESNSTKKK 
SKKRKLTSVKRARVSGDDFSVPSKNSESTLFDSVEGDGAMMEEG 
VKS RPRQKKTQACLAS KHVQEAPRLE PANESHNVETAEDS B IR Y 
uo/u/^wru/wL'dXJ>iL/ij\>oAv A.UJjQj&r iPNlKDRATSTI KRMYRDD 
L ER FKEF KAQGVAI K FG KFS VKENECQLE KNVEDFLAL TG IES AD 
KLL YTDR Y PE EKS VI TNLKRRYS FRLHI G \ RNI ARPWKLI YYRA 
KKMFDVNN YKGRYSEGDTEKLKMYHSLLGNDWKTI GEM VARRSL 
S VALKFSQ I S S QRNRGAWS KS ETRKIiI KAVEEVI LKKMS PQELK 
BVDSKLQENPESCLSIVREKLYKGISWVEVEAKVQTRNWMQCKS 
KWTEILTKRMTNGRR I YYGMNALRAKVSIiI ERL YB INVEDTNE I 
DWEDLASAIGDVPPS YVQTKFSRLKAVYVPFWQKKTFPEI IDYL 
YETTLPLLKEKLEKMMEKKGTKIQTPAAPKQVFPFRD I FYYBDD 
S EGGGHRKRKRRPRRHAW FTP VI PVLWEAKAGWI I 


5413


3753


1304 


RFPAGVAPRRAMANVSKKVSWSGRDRDDEEAAPIjLRRTARPGGG 
TPLLNGAGPGAARQSPRSALFRVGHMSSVKLDDEIiLEP\DMDPP 
HPFPKBIPHNEKLLSLKYESLDYDNSEKQLFIiEEERRINHTAFR 

tveikrwvicaligiltglvacfidiwenlaglkyrvikgnid 
kftekgglsfslli.watlnaafvi*vgsvivafiepvaagsgipq 
ikcfi^gvkiphvvrlktlvikvsgvtlsvvgglavgkegpmih 
sgsviaagisqgrstslkrdfkifbylrrdtekrdfvsagaaag 
vsaafgap vggvlpsleegas fwnqf ltwr i ffasm i stftlnf 
vlsiyhgnmwdlsspglinfgrfdsekmaytiheipvpiamgw 
ggvlgavfnalnywltmfriry ihrpclqvi eavlvaavtatva 
fvliyssrdcx)plqggsmsyplqlfcadgeynsmaaaffntpek 
SWSLFHDPPGSYNPLTLGLFTLVyFFLACWTYQLTVSAGVPlF
SLL IGAAWG R L ?G I S LS Y LTG AAI WADPGKYALMGAAAQLGG I V 
RMTLSLTV1MMEATSNVTYGFPIMLVLMTAKIVGDVFIEGLYDM 
HIQLQS VPFLHWEAP VTSHSLTARE VMSTP VTCLRRRBKVGVI V 
DVLSDTASNHNGFPWEHADDTQPARLQGL I LRSQL I VLLKHKV 
FVE RSNI/3LVQRRLRLKDFRDAYPR FPP I QS IHVSQDERECTKD 
LS B FMNPSP YTVPQEAS L PRVPKLFRALGLRHLWVDNRNQ WG 
LVTRKDLARYRIiGKRGLEELSLAQT 


5414 


2130


390 


GVAS AWDRAL FS PLLS PTS RVFRTS P PRC VSTETGRRDRARVP S 
QWCSVLQGICLPVSGRTSLACVRSILLSPASSPRKVGIVGGTGAR 
AGAAPRDHGRVRHRRPSSARRMTRTTGQCLAPRGCQGPRGTRSP 
RSPRSRTRRGCSASPACLP/CRSALIVAVLCYINriLNYMDRFTV 

RKYLMCGGIAFWS LVTLGS SFI PGEH FWLLLLTRGLVGVGEAS Y 
STIAPTIilADLFVADQRSRMLSlFYFAIPVGSGLGYIAGSKVKD 
MAGDWHWALRVTPGLGWAVLLLFLWREPPRGAVERHSDLPPL 
NPTSW WADLRALARNPS FVLSSLGFTAVAFVTGSIiALWAPAFLL 
RSRWLGETPPCLPGDSCSSSDSLIFGLITCLTGVIjGVGIiGVEI 
SRRIiRHSNPRADPljVCATGLtjGSAPFLFLSLACARGS IVATYIF 
IFIGETLLSMNWAI VAD I LLYVVI PTRRSTAEAFQI VhSHLUSD 

AGSPYLIGLISDRLRRNWPPSFLSEFRALQFSLMLCAFVGALGG 
AAFLGTAHLH 


5415


693 


2986 


IPPKTKLELQKH \LTTLT \NQEQAT1 FEEVQKLRPRNEQRENEL 

I IS PLRCLFEBKQKEHIHIGEMKQTSQMAAENIGS ELPPSATRF 

RLDMLKNKAKRSl»TESI>ES ILSRGNKARGLQBHS I SVDLVSShS 

STLSNTSKEPSVCEKEALPISESSFKLLGSSEDLSSDSESHLPE 

EPAPLSPQQAFRRRANTLSHFPIECQEPPGPARGSPGVSQRKLM 

RYHSVSTETPHEKKDFES KANHLGDSGGTPVXTRRHSWRQQ I FL 

RVATPQKAC3DSSSRYEDYSELGELPPRSPLEPVCEDGPFGPPPE 

EKKRTSRELRELWQKAILQQI LLLRMEKENQKLQASENDLLNKR 

LKLD YEE I TPCT»If RVTTVUi? vur .cvonxy a vt vcnhrovMUA * •» 

* aa * * f^ufto v x i v nCtiS-PiLto x JrVKoK-tKr DriEKWHSAVGQ 

GVP\RHHRGEIWKFLAEQFHLKHQFPSKQQPKDVPYKELLKQLT 
SO^HAILIDLGRTFPTHPYFSAQLGAGQLSLYNILKAYSLLDQE 
VGYCO^LSFVAGILLLHMSEEEAFW^KFLMFDMGLRKQYRPDM 
I ILQIQMYQLSRLLHDYHRDLYNHLBEHEIGPSLYAAPWFLTMF 
ASQFPLGF VARVFDM I FLQGTE VIFKVALSLLGSHKPLI LQHEK 
LETIVDFIKSTLPNLGLVQMEKTINQVFEMDIAKQLQAYEVEYH 
VLQEE LIDS S PLS DNQ RMD KLEKTNS SLRKQNLDLLEQ LQ VANG 
RIQSLBATIEKLLSSESKLKQAMLTLELERSALIiQTVEBLRRRS 
AKPSDREPECTQPEPTGD 


5416 


27 


4074 


KSQLFCFWGGKAGDILSGDQDKEOKDPYFVETPYGYQLDLDFLK
YVDDIQKGNTIKRLNIQKRRKPSVPCPEPRTTSGQQGIWTSTES 
LSSSNSDDNKQCPNFLIARSQVTSTPISKPPPPIjETSLPFtTIP 
ENRQLPPPS PQLPKHNLHVTKTLMETRRRLEQERATMQMTPOEF 
RRPRLASFGGMGTTSSLPSFVGSGN^PAKHQLQNGYQGNGDYQ 
S YAPAAPTTSSMGSS I RHSPLSSGISTPVTNVS PMHLQHIRBQM 
AIALKRLKELEEQVRTI PVLQVKI S VLQEEKRQLVSQLKNQRAA 
SQ INVCGVRKRS YS AGNASQLEQLSRARRSGGELYIDYEE EEM E 
TVEQSTQRIKEFRQL\TADMQALEQKIQDSSCEASSELRENGEC 
RS VAVGAEENMNDI WYHRGSRS CKDAAVGTLVEMRNCGVSVTE 
AMI/5VMTEADKEIELOX3QTIESLKEKIYRLEVQLRETTHDREMT 
KL KQ ELQAAG S RKKVD KATNAQ P LVFS KWEA WQTRDQMVGS H 
MDLVDTCVGTSVETNSVGISCQPECKNKWGPSLPMNWWIVKER 
VEMHDRCAGRS VEMCDKS VS VBVS VCETGSNTEES VWDLTLLKT 
NLNLKEVRS I G CGDCS VD VTVCS PKE CAS RGVNTEAVS Q VEAA V 
MAVPRTADQDTSTDIjEQVHQFTNTETATL IESCTNTCLS TLDXQ 
TSTQTVETRTVAVGBGRVKDINSSTKTRSIGVGTLLSGHSGFDR 
PSAVXTKESGVGQININDNYLVGLKMRXIACGPPQLTVGLTASR 
RSVGVGDDPVGESLENPQPO^PLGMMTGLDHYIERIQKLljAEQQ 
TLLAENYSELAEAFGEPHSO^GSLMSQLISTLSSINSVMKSAST 
BELRNPDFQKTS I^KITGSYLGYTCKOGGLQSGS PLSSQTSQPB 
0EVGTSFfik'PT^^r.'n2vlrDTnT?rtTT.caTnvfr TnnrkTftH.r'r vnr^piunvT 
ESTUCS IMK3CKDGNKDSNGAKKNLQFVGINGGYETTSSDDSS SD 
ESSSSESDDECDVIEYPLEBEEEEBDBDTRGMAEGHHAVNIEGL 
KSASVEDEMQVQECEPEKVEIRERYELSEKMLSACNLLKNTIND 
P KALTS KDMR FCLNTLQHEWFRVS S QKSA I PAMVGD Y I AAFEA I 
S PDVLR Y V I NLADGNGNTALHYSVSHSNFEI VKLLIiDADVCNVD 
HQNXAG YTP IMLAAIAAVEAEKDMRIVBELFG OGDVNAKASQAG 
QTALMIAVSHGRIDMVKGIJ^a?ADWlQDDEGSTAl^CASEHG 
HVEIVKLLLAQPGCNGHLEDNDGSTALSIALEAGHKDIAVLLYA 
HVNFAKAQS PGTPRLGR KTS PG PTHRGS FD 


5417 


27 



4074 



KSQIiFCFWGGKAGDILSGDQDKEQKDPYFVETPYGYQLDLDFLK 
Y VDDIQKGNTI KRIiNIQKRRKPSVPCPEPRTTSGQQGI WTSTES 
LSSSNSDDNKQCPNFLIARSQVTSTPISKPPPPLETSLPFLTIP 
ENRQLPPPSPQLPKHNLHVTKTLMETRRRLEQERATMQMTPGBF 
RRPRLAS FGGMGTTS S LPSFVGSGNHNPAKHQLQNGYQGNGDYG 
SYAPAAPTTSSMGSSIRHSPLSSGISTPVTNVSPMHLQHIREQM 
AIALKRLKELEEQVHTIPVLQVKISVLQEEKRQLVSQLKNQRAA 
SQINVCGVRKRSYSAGNASQIiEQLSRARRSGGELYlDYEEEEME 
TVEQ3TQRIKEFRQIi\TADMQALEQKIQDSSCEASSEI>RENGEC 
RSVAVGAEENtWDIWYHRGSRSCKDAAVGTLVEMRNCGVSVTE 
AMLGVMTEADKEIELQQQTIESLKEKIYRLEVQLRETTHDREMT 
KLKQSLQAAGSRKKVDKATMAQPLWSKVVBAVVQTRDQMVGSH 
MDLVDTCVGTSVETNSVGISCQPECKNKVVGPELPMNWMIVKER 
VEMHDRCAGRSVEMCDKSVSVEVSVCETGSNTEESWDLTLLKT 
NLNLKEVRS IGCGDCSVDVTVCS PKECASRGVNTEAVSQVBAAV 
MAVPRrADQDTSTDLEQVHQFTNTETATLIESCTNTCLSTLDKQ 
TSTQTVETRTVAVGEGRVKDINSSTKTRS1GVGTLLSGHSGFDR 
PSAVKTKES GVGQIN1NDN YLVGLKMRTIACGP PQI iTVGLTASR 
RSVGVGDDPVGESLENPQPQAPLGMMTOI.DHYXERIQKLIAEQQ 
TLLAENY5ELAEAFGEPHSQMGSLNSQLISTLSSINSVMKSAST 
EELRNPDFQKTSLGKITGS YLGYTCKCGGLQSGS PLSSQTS QPE 
QEVGTSEGKP ISSUDAF P TQEG TLS P VNLTDDQ IAAGLYACTNN 
BSTLKS IMKKKDGNKDSNGAKKNLQFVGINGGYETTSSDDSS SD 
ESSSSESDDECDVIEYPIiEEEEEEEDEDTRGMAEGHHAVNIEGL 
KSARVEDEMQ VQECBPE KVEIRER YEIiSBKMLSACWLLKNTIND 
P KALTS KDMR FCLNTLQHEWFRVSSQX.SA I PAMVGDY I AAFEA I 
S PDVLRYVINLADGNGNTALHYS VSHSNFE I VKLLLDAD VCNVD 
H QNKAGYT P I MLAALAAVEAEKDMR I VEELFG CGD VNAKAS QAG 
QTALMLAVSHGRIDMVKGLLACGADVNIQDDEGSTALMCASEHG 
HVE I VKLLLAQPGCNGHLEDNDGSTALS IALEAGHKDIAVLLYA 


5418


24 


1133 


SVPRAGGDMBTGAAELYDQALLX3ILQHVGNVQDFLRVLFGFLYR 
KTDFYRLLRHPSDR]>K3FPPGAAQALVLQVFKTFDHMARQDDEKR 
RQELEEKIRRKEEEEAKTVSAAAAEKEPVPVPVQEIEIDSTTEL 
DGHQEVEKVQPPGPVKEMAHGSQEAEAPGAVAGAAE VPR\ EPPI 
LPRIQEQFQKNPDSYNGAVRENYTWSQDYTDLEVRVPVPKHWK 
GKQVSVALSSSSIRVAMLEENGERVLM3GKLTHKINTESSLWSL 
EPGKCVLVNLS KVGE YWWNAI LEGEEP ID IDKINKERSMATVDE 

EEQAVLDRLTFDYHQKWJGKPQSHBLKVHEMLKKGVfDAEGSPFR 
GORFDPAMFMISPGAVnP 


5419 


1395 


259 


GTHPLDPDLVSRTSVQGPLM^HA^PGMSPtEESPFLGPRAAEEG 
SESEACEAFGRRKSEEEGRRSDTSGFGRSRKHKVNWKHPERADA 
lu^PASLPQC/LGP/DCVRPAQPSSKYCSDDCGMKLAANRIYEIL 
PQRIQQWQQSPCIAEEHGKKLLERIRREQOSARTRI.QEMERRFH 
ELEAIILRAKQQAVREDEESNEGDSDDTDLQIFCVSC3GHPINPR 
VALRHMERCYAKYESQTSFGSMYPTRIEGATRLFCDVYNPQSKT 
YCKRLQVLCPEHSRDPKVPADEVCGCPLVRDVFELTGDFCRLPK 
RQCNRHY CWEKLPJRAEVDLERVRVWYKI*DELFEQERNWTAMTN 
RAGLLALMLHQTIQHDPLTTDLRSSADR 


5420 


117 


1733 


NEAGGACPFKGGASGRLYLSPRLPRVSVAGCEERPLGWVWVLGG 
GGFLPARPPRAQRHLGFSHAEQSMEAPDYEVLSVREQIjFHBRIR 
ECIISTLLFATLYILCHI^LtoidCPAEFTT\GMMKMPPSTRL/ 
LLELCTFTLAIALGAVLLLPFSI I<3M'3 , l7T.T.CT.i>P'KrwTrvtJT wrae 

LIHGLWNLVFLFSNLSL I FLMP FAYFFTESBGFAGSRKGVLGRV 
YETVVMLMLLTLLVLGMVWVASAI VDKNXANRESLYDFWEYYLP 
YLYS CI S FLGVLLLLVCTPLGLARMFS VTGKLLVKPRLLEDLEE 
QIiYCSAFBEAALTRRICNPTSCWLPLDMELLMRQVLiAIiQTQRVL 
LE KRRKAS AWQRNI/3 Y PIjAMLCLLVLTGL S VLI VAIHI LELLID 
EAAM PRGMQGTS LGQVS FS XLGS FGAVI Q WL I F YLMVS S WGF 
YSSPLFRSLRPRWHDTAMTQIlGNCVCLIiVLSSALPVFSRTLGL 
TRFDLLGDFGR FN WI/3NF Y I VFL YNAAFAGLTTLCLVKT FTAAV 
RAELIRAFGERE 


5421


117 


1733 


NEAGGACPFKGGASGRL YLS PRLPR VS VAGCEER? LGW VW VLGG 
GGFLPARPPRAQRHLGFSHAEQSMEAPDYEVLSVREQLFHERIR 
ECI I STLLFATL YI LCH I FLTRFKKPAEFTT\GMMKMPPSTRL/ 
uijcuv. if i JUM.XALrtjKVJjjjijl'r a 1 XodI kvLiLoLPRNYYIQWLNGS 
LIHGLWNLVFLFSNLS L I FLMPFAYFFTESBG FAGS RKGVLGR V 
YETVVMLMLLTLLVIjGMVWVASAIVDKHKANRESLYDFWEYYLP 
YLYSClSFLGVLLLLVCTPLGLARMFSVTGKLLVKPRIiLEDliEE 
QL YCSAFEEAALTRR ICNPTS CWL PLDMELLHRQ VLALQTQRVL 
LEKRRKASAWQRNLGYPLAMI^LVLTGLSVLIVAIHILELIjID 
EAAMPRGMQGTS LGQVS FS KLGS FGAVIQWL I F YLMVS SWGP 
YSSPLFRSLRPRWHDTAMTQI I GNC VCLLVLS S AL PVFSRTLGL 
TRFDLLGDFGR FNWLGN FY I VFL YNAAFAGLTTLCLVKT FTAAV 
RAELIRAFGERE \ 


5422 


3 


1263 


t>^u \L t>x>f rw UACAAy KPU I GRKGGAWGGRGGSS PAQ VLLS PGP VF 
KAGCNW WHLSRDQAG VQRCDLGS SQP PPLG FKR FS CLSLPSS WD 
YRSTVLCVSKMEADLSGFNI DAPRWDQRTFLGRVKHFLNITDPR 
TVFVSERELDWAKVMVEKSRMGWP PGTQVEQLI .YAKKLYDSAF 
HPDTGE KMNVIGRMSFQLPGGMI ITGFMLQFYRTM PAVI FWQWV 
NQS FNALVNYTNRNAASPTS VRQMALS Y FTATTTAVATAVGMNM 
LTKKAP PLVGRW VPFAAVAAANCVNI PMMRQQELI KGICVKDRN 
ENEIGHSRRAAAIGITQWISRITMSAPGMILLPVIMERLEKLH 
FMQKVKVL/SAPLQVMbSGCFLIFMVPVACGLFPQKCELPVSYL 
EPKLQDTI KAKYGELEPYVYFNKGL 


5423 


3186 


905 


GVSMALGEEKAEAEASEDTKAQSYGRGS CRERELDI PGPMSGEQ 
PPRLEAEGG LI SPVWGAEG I PAPTCWIGTDPGG PS RAHQPQASD 
ANRE P VAERSE PALS GLP PATMGSGDLLLSGESQVEKTKLSfl S E 
EFPQTLS LPRTTI CSGHDADTEDDPSLADLPQALDLS QQPHS SG 
LSCLSQWKSVLSPGSAAQPSSCSISASSTGSSLQGHQERAEPRG 
aSLAKVSSSLEPWPQEPSSWGLGPRPQWSPQPVFSGGnASGL 
GRRRLSFQAE YWACVLPDSLPPS PDRHSPLHN PNKE YEDLLD YT 
YPLRPGPQLPKKLDSRVPADPVLQDSGVDLDSFSVSPASTLKSP 
TNVS PNCP PAEATALPFSG PRE PSLKQWPS RVPQKQGGMGLAS W 
SQLASTPRAPGSRDARWERREPALRGAKDRLTIGKHLDMGSPQL 
RTRDRGWPSPRPEREKRTSQSARRPTCTESRWKSBEEVESDDEY 
LALPARLTQVSSLVSYLGSISTLVTLPTGDIKGQ5PLEVSDSDG 
PASFPS9S SQSQLPPGAALQGSGDPEGQNPCFLRS FVRAHDSAG 
EGSLG S S QALGVS SGLLKTR PS LPARLDRWP FSD PDVEGQLPRK 
GGEQGKESLVQC\VKTFC\CQLEELICWLYNV\ADVTDHGTPAR 
SNLTS LK\ S S LQLY RQFKKD I DE HQSLTES VLQKGE I LLQCLLE 
NTPVLEDVLGRIAKQSGELESHADRLYDSILASLDMLAGCTLIP 
DKKPMAAMEHPCEGV 


5424 


3186 


905 


GV5MALGEEKAEAEASEDTKAQSYGRGSCRERELDIPGPMSGEQ 
PPRLBAEGGLISPVWGAEGIPAPTCWIGTDPGGPSRAHQPQASD 
ANREPVAERSEPALSGLPPATMGSGDLLLSGES QVEKTKLSS SE 
EFPQTLSLPRTTICSGHDADTEDDPSLADLPQALDLSQQPHSSG 
LSCrjSQWKSVLSPGSAAQPSSCSISASSTGSSLQGHQBRAEPRG 
GSLAKVSSSLEPWPQEPSSWGLGPRPQWSPQPVFSGGDASGL 
GRRRLS FQAE YWACVLPDSLPPS PDRHSPLWNPNKEYEDLLDYT 
YPLRPG PQLPKHLDSRVPADP VLQDSQVDLDSFSVS PASTLKS P 
TNVSPNCPPAEATALPFSOPREPSLKQWPSRVPQKQGGMGLASW 
SQLASTPRAPGSRDARWERRBPALRGAKDRLTIGKHLDMGSPQL 
RTRDRGWPSPRPEREKRTSQSARRPTCTESRWKSERBVESDDEY 
LALPARLTQVSS LVS YLGS 1 S TLVTLPTGD I KGQS PLE VSDSDG 
PASPPSSSSQSQLPPGAAIiQGSGDPEGQNPCFLRSFVRAHDSAG 
EGSIiGSSQALGVSSGLLKTRPSLPARLDRWPFSDPDVEGQLPRK 
GGEQGKESLVQC\VKTFC\CQLEELICWLYNV\ADVTDHGTPAR 
SNLTSLK\SSLQLYRQPKXDIDEHQSLTESVLQKGEILLQCLLE 

DKKPMAAMEHP CEJG V 


5425


1086 


115 


GFCPSPSI/SHQPPRVLHPTMSMAVETFGFFM^^ 
NSYWRVSTVHGNVITTNTIFENLWFSCATDSLGVYNCWEFPSML 
ALSGYIQACRALMITAlLLGFLGLLLGIAGIiRCTNIGGLELSRK 
AKLAATAGAPH\ ILPGICGMVAI \SWYAFNITR\DFSDPLYPGT 
KYELGPALYLGWSASL1SILGGLCLCSACCCGSDEDPAASARRP 
YQAPVSVMPVATSDQEGDSSFGKYGRNALRVAALCRGPRCLPTA 
PKKRGPGRGPFPYSNLRGRPRPVPVAPPRPRPRVLHSHGPSQAK 
NCSWE VAYLPSEAGSLI F 


5426 


42 


3435 


ATS S QS LGRADP PRGGTM ERSPGEGPS PS PMDQPS APSD PTDQP 

PAAHAKPDPGSGGQPAGPGAAGEALAVLTSFGRRLLVLI PVYLA 

GAVGLSVGFVLFGLALYLGWRRVRDEKEKSLRAARQLLDDEEQL 

TAKTLYMSHRELPAWVSFPDVEKAEWLNKIVAQVWPFLGQYMEK 

LLAETVAPAVRG SNPHLQTFTFTRVELGEKPLRI IGVKVHPGQR 

KEQILLDLNISYVGDVQIDVEVKKYFCKAGVKGMQLHGVLRVIL 

BPLIGDLPFVGAVSMFFIRRF^LDIiWTCMTNLLDIPGLSSLSD 

TMIMDSIAAFLVLPNRLLVPLVPDLQDVAQLRSPLPRGIIR1HL 

LAARGLS S KDK YVKGL I EG KSDPYALVRLGTQTF CSRV X DE ELK 

PQWGET YEVMVHE VPGQE I EVEVFDKDPDKDDFLGRMKLD VGKV 

LC2ASVIJDDWFPLQGGQGQVHLRLEW/jSLLSDAE2CLEQVLQWNWG 

VSSRPDPPSAAILWYLDRAQDLPMVTSELYPPQLKKGNKEPNP 

MVQLS IQDVTQES KAVYSTNCPVWEEAFRFFLQDPQSQBLDVQV 

KDDSRALTLGALTLPLARLLTAPEL1LDQWFQE.SSSGPNSRLYM 

KLVMRILY3jDSSEICFPTVPGCPGAWDVDSENPQRGSSVX)APPR 

PCHTTPDSQFGTEHVLRIHVLEAQDLIAKDRFLGGLVKGKSDPY 

VKLKIjAGRSFRSHVVREDIiNPRWNEVFEVIVTSVPGQELEVEVF 

DKDLDKDDFLGRCKVRiTTVLNSGFLDEWLTLEDVPSGRLKLRL 

ERLTPRPTAAELEEVI^WSLIQTQKSAELAAALLSIYMBRAED 

LPLRKGTICHLSPYATLTVGDSSHKTKTISQTSAPVWDESASPLI 

RKPKTESLEI^VRGEGTGVLGSLSLPLSELLVADQLCLDRWFTL 

S SGQGQ VLLRAQLG I L VSQHSGVEAHSHS YSHS SSSLSEE PELS 

G3PPHITSSAPEV\RQRLTHVDSPLEAPAGPliGQVKLIXWYYSS 

ERKLVS I VHGCRSLRQNGRDP PDPYVSLLLLPDKNRGTKRRTSQ 

KKRTLSPEFNERFEWELPLDEAQRRKLDVSVKSNSSFMSREREL 

LGKVQLDLAETDLS QG VARW YDLMDN KDKGS S 


5427 


42 


3435


atssqslgradpprWMerspgegpspspmdqpsapsdptdqp 
paahakpdpgsggqpagpgaageaiavltsfgrrllvlipvyla 
gavgls vgfvlfglal ylgwrrvrde kerslraarqllddeeql 
taktl ymshr elpaw vs fpd ve kaewlnkivaqvwp flgqymex 
llaetvapavrg snphlqt ft ftrvelgekplr i igvkvhpgqr 

KEQILLDLNISYVGDVQIDVEVKKYFCKAGVKGMQLHGVLRVIL 
EPI»IGDLPFVGAVSMFFIRRPTIjDINWTGMTNLLDIPGIiSSLSD 
TMIMDS IAAFLVLPNRLLVPLVPDLQDVAQLRSPLPRGI IRIHL 
LAARGLSSKDKYVKGIjIEGKSDPYALVRLGTQTFCSRVIDEELN 
PQWGETYEVMVHBVPGQE I EVEVFDKDPDKDDFLGRMKLD VGKV 
J^ASVLDDWFPLQGG^QVHLRLEWI^LLSDAEKLEQVLQWN^JG 

vssrpdppsaailwyldraqdlpmvtselyppolkkgnkepnp 

MVQLS IQDVTQES KAVYSTNCP VWEEAFRFFLQDPQSQELDVQV 
KDDSRALTLGALTLPLARLLTAPELILDQWFQLSSSGPNSRLYM 
KLVMRILYLDSSEICFPTVPGCPGAWDVDSENPQRGSSVDAPPR 
PCHTTPDSQFGTEHVLRIHVLEAQDLIAKDRFLGGLVKGKSDPY 
VXLKIAGRSFRSHVVREDLNPRWNEVFEVIVTSVPGQELEVEVF 
DKDIJDKDDFliGRCKVRLTTVLNSGFIiDEWLTLEDVPSGRLHI^ 
' E KLTPRP TAA^LEE VLQWSLIQTQK5AE1AAALLS I YMERAED 
L PLRKGTKHLS PYATLTVGDSSHKTKTI SQTSAPVWDES ASFLI 
RKPHTESLELQVRGEGTGVLGSLSLPLSELLVADQLCLDRWFTL 
S SQQGQVLLRAQLGI LVSQHSGVEAKSHS YSHSSSS LS EE PELS 
GGPPHITSSAPEV\RQRLTHVDSPLEAPAGPLGQVKLTLMYYSE 
ERKLVS I VHG CRSLRQNGK DPPD P YVSUXLPDKNRGTKRRTS Q 
KKRTLSPEFNERFEWELPLDEAQRRKLDVSVKSNSS FMSREPJ3L 
LG KVQLDLAETDLSQG VARWYDLMDNKDKGSS 


5428 


3 


1839 


SS RS ERLS ACAI APPWL VS SR PAR PAQLQRPGXMVEDGAEELED 
LVHFSVSELPSRGYGVMEElRRQGKLCDVTLiCIGDHKFSAHRIV 
LAAS I PYFHAM FTNDMMECRQDE I VMQGMDPSALBALINFAYNG 
NLAI DQQKVQS L LMGAS FLQLQS I KDACCT PL R ERLH P KNCLGV 

RQFAETMMCAVLYDAANSFIHQHFVEVSMSEEFLALPLEDVLEL 
VSRDELMVKfiE'PriT7T7Rli3^T.lvtfftrDvT»tjn»r»Ti/-i»iiEiT v n»T _ 
v x\o muv r tv>iAJ-»rtW v K YLJKEQKGTFL \RNLQSNI RLL 

FCRPQPLSDRVQQDDLVRCCHKCRDLVDEAKDYLLMPERRPHIiP 

AFRTR PRCCTS I AGLI YAVGGLNSAGDSLNWEVFDP I ANCWER 

CRPMTTAnSRVGVAWNGLLYAIGGYDGQLRLSTVQAYNTETDT 

WTRVGSMNSKRSAMGTVVLDGQIYVCXSGYDGNSSIjSSVETYSPE 

TDKWTWTSMS SNR^AAN f5VTVRl?n© T wcrroivi mT T»r» r»i rm, 

ynhhtatwhpaagmlnkrcrhgaas lgskmfvcgg ydgsg FLS I 
AEM ys s v\adqwcli vpm\htrr\ srvslggpavgrlyavwg vt 
tgqsnl\ssvgdvltpetdojtfm\apmacheggvgvgcipllt 


5429 


828 


202 


RREDALSSEGCIiWPSE3TVSGNGIPEPQVYAPPRPTDR"LAVPPF 
AQRER FHRFQPT YP YLQHE I DLPPT ISLS dgeeppp yqgp ctlq 
LRDPEQQLEIiNRESVRAPPNRTIFDSDLMDSARIiGGPCPPSSNS 

gisatcygsggrmegppp\tysevighypgssfqhqqssgppsl 

LEGTRLHHTH I APLF.SAA TU Q ICR YTW OTf fUOT 


5430 


441 


1507 


QKRRKRRRKKlMKTIQPKMHNStSWAlFTOLAALCLFQGVPVRS~ 
GDATFPKAMDNVTVRQGE S ATLRCT I DNRVTRVAWLNRST I LYA 
GNDKWCLDPRVVLLSNTQTQYSIEIQNVDVYDEGPYTCSVQ'TDM 
HPKTSRVHLIVQVSPKIVEISSDISINEGNNISLTCIATCRPEP 
TVTWRHI S PKAVGFVSEDEYLEI QGITREQSGDYECS ASNDV\A 
APV\VPJ^VKVTVNYPPYlSEAi05TGVPVGQKGTLQCEASAVPSA 
EFQWYKDDKRLI/EGKKGVKVENRPFLSKLIFFNVSEHDYGNYT 
^ASNKIXnTTNAS IMLFGPGAVSEVSNGTSR^ 


5431 


2 


1312 


AAAAPGSRRRRPLPDRPHMAHGYRAPPPPappQDZkUDADcvnTrT — 

LPGITINP\TIAEGPSP\TSEGASEANLVDLQKKLEELELDEQQ 
KKRLEAFLTQKAKVGEJ^KDDDFERISBIiGAGNGGVVTKVQHRPS 
GLIMARKLIHIjEIKPAIRNOIIRELQVIJIECNSPYIVGFYGAFY 
SDGEISI CMEHKDGGS LDQVLKE AKR I PEE ILGKVS IAVLRG LA 

YLREKHQ imhrdvkpsnilvnsrgeiklcdpgvsgqlidsmans 
fvgtrsymaperlqgthysvqsdiwskglslvelavorypippp 
dakeleaifgrpwdgeegephsisprprppgrpvsghgmdsrp 
amaifelldyivnbpppklpngvftpdfqefvnkclixnpaera 

PLKMLTNHTF I KRSE VSBVDFAG WLCKTLRUTQPG TPTRTAV 


5432 
5433 


2 


1312 


AAAAPGSRRRR PLPDR PHMAHGYEAPPP PAPfe£ PAwp &P c trpv\ — 
LPGITINP\TIAEGPSP\TSBGASEANLVDLQKKLBELELDEQQ 
KKRLEAFLTQKAKVGELKDDDFERISELGAGNGGVVTKVQHRPS 
GLIMARKLIHLEIKPAIRNQ2 IRELQVLHECHS PYI VGFYGAFY 
S DGS IS I CMEHMDGGS LDQVLKEAKRI PEE ILGKVS IAVLRGLA 
YLREKHQ IMHRDVKPSWILVNSRG B I KLCD FG VS GQL IDSMANS 
FVGTRSYMAPERLQGTHYSVQSDIWSMGLSLVELAVGRYPIPPP 
DAKELEAI FGRP WDGEEGEPHS I S PRPRPPGRPVSGHGMDSR P 
\MAIFEIJ^YIVNEPPPKLPNGVFTPDFQEFVNKCLIKNPAERA 
DLlMLimrFlXRSEVEEVDFAGWLCKTLRLNQPGTPTRTAV 




360 


1885 


SVQEDKVGFEJJPLHLCSWRARACPCTWPHC/CTGLLECLGFAGV 
[jFGWPS LVFVFKNEDYFKDLOGPDAG PIGNATGQADCKAQDERF 
5 L I F TLGS FMNNFMTFPTG YIFDRFKTTVARL IA I FFYTTATL I 
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








IAFTSAGSAVLLFLAMPMLTIGGILPLITNLQIGNLFGQHRSTI 
ITLYNGAFDSSSAVFLIIKLLYEKGISLR/VLIiHIjHLCLQYLAC 
STHFPPEtAPGAHPlPrAPQLQLWPVPWEWHHKGREXS/QQLSMKT 
GSYSQRSSFQRRKUPQGQGRSRNSAPSGATL/CSRRFAWHLVWL 
S VI QLWHYLF IGTLNSLLTNMAGGDMARVSTYTNAFAFTQFG VL 
CAPWNGLLMDRLKQKYQKEARKTGS S TLAVAIiCSTVPS LALTSL 
LCLGFALCAS VPILPLQ YLTF I LQ VI SRS FL YGSNAAFLTIAFP 
S EHFGKL FG L VMAL S AWSL LQ FP I FTL I KG S LQND P FYVNVMF 
MLAILLTFFHPFLVYRBCRTWKESPSAIA 


5434


66 


652 


RYAALIISLIQHKLLWRNQHCSRCVIMSPAQSAGLNWLF/GSGK 
HGPFUSCSQYPACDYVRPLKSSADGHIVKVLEGQVCPACGANLV 
LRQGRFGMFIGCINYPECEHTBLIDKPDBTAITCPOCRTGHLVQ 
RRSRYGKTFHSCDRYPECQFAINFKPIAGECPECHYPLLIEKKT 
AQGVKHFCASKQCX3KPVSAE 


5435 


4704


1597


PGDSSQRLAEMSNAKERKHAKKMRNQPINVTLSSGFVA15RGVKH 
HSGGEKPFQAQKQEPHPGTSRQRQTRVNPHSLPDPEVNEQSSSK 
GMFRKKGGWKAGPEGTSQEI PKYITASTFAQARAAEISAMLKAV 
TQKSSNSLV FQTLPRHMRRRAMSHNVXRIiPRRLQE IAQXRAEKA 
VHQKKEHS KNKCHKARRCHMNRTLEFNRRQKKN I WLETH I WIIAK 
R PHMVKKWG YCLGERPT VKSHRACYRAMTORCIiLQDLS YYOCJjE 
L KG KE EE I LKAL SGM CN I DTGLTFAAVHCLS G KRQGSLVL YRVN 
KYPRBMLGPVTFIWKSQRTPGDPSESRQLWIWIiHPTLKQDILEE 
I KAACQC VEP I KS AVClADPIiPTPSQEKSQTEIiPDEKIGKKRKR 
KDDGENAKPIKKI IGDGTRDPCLPYSWISPTTGI I ISDLTMEMN 
RFRI>IGPLSHSILrBAIKAASVHTVGEDTEETPHRWWIETCKKP 
DSVSLHCRQEAI FELLGGI TSPAEIPAGTILGLTVGDPRINLPQ 
KKSKALPNPEKCQDNEKVRQLLLEGVPVECTHSFIWNQDICKSV 
TENKISDQDLMRMRSELLV?GSQLILGPHESKlPIIiLIQQPGKV 
TGEDRLGWGSGWDVLLPKGWGMAFWIPFIYRGVRVGGIiKESAVH 
SQYKRSPNVPGDFPDCPAGMLFAEEOAKNLLEKYKRRPPAKRPN 
YVKLGTLAPFCCPWEQLTQDWESRVQAYEEPSVAS S PNG KESDL 
RRSEVPCAPMPKKTHQPSDEVGTSIEHPREAEEVMQAGCQESAG 
PERITDQEASENHVAATGSHLCVLRSRKLLKQLSAWCGPSSEDS 
RGGR RAPGRGQQGLTREACIiS ILGKFPRALVWVSLSLLS KGSPE 
PHTMI CVPAKEDFLQLHEDWH YCGPQESKHSDP FRSKILKQKEK 
KKREKRQKP\GRASSIX3PAGEEFVAGQEAIiTLQLW3GPLPRVTIi 
HCS RTLLGFVTQGDFSMAVGOGEALGFVS LTGLLDML3SQPAAQ 
RGLVLLRPPASLQYRPAR I AIE V 


5436 


1781 


635 


ASDS I PWSEARTTRKLAQRGCQWSLPERMPLWFCGLP YSGKSR 
RAEELRVALAAEGRAVYVVDDAAVLGAEDPAVYGDSAREKALRG 
ALRASVERRLSRHDWILDSLNYIKGFRYELY\CIiARAARTPLC 
LVY CVR PGGP I AGP Q VAGANENPGRNVS VS WRPRAEED GRAQ AA 
GSS VLREliHTADS WNGSAQADVPKELEREESGAABS PALVTPD 
SEKS AKHGSGAFYS PELLBALTLRFEAPDSRNRWDRP LFTLVGL 
ESPIiP LAGIRSALFENRAPPPHQSTQSQPLASGS FLHQLDQVTS 
QVLAGLMEAQ KS AVPGDLLTL PGTTEHLR FTR PLTMAELS RL RR 
QFISYTKMHPNNENr.PQLANMFLQYLSQSLH 


5437 


739 


1472


CQEAASEFGGPLHTPA^FLRRLGGWLPRPKGRRKPMRPDPPYPE 
PRRVDSSSENSGSDWDSAPETMEDVGHPKTKDSGALRVSRAASE 
PSKEBPQVEQLGSKRMDSLKWDQPISSTQESGRLEAGGASPKLR 
WOHVDSGGTRR PG VS P EGGL \GVPGPGAPLEKPGRRE KLLGWLR 
GEPGAPSRYIXSGPEECIX3ISTNLTLHIJ*ELLASALLALCSRPLR 
AALDT IiGL RGPLGLW LHGLLS FLAALHGLHAVLSLLTAHPLHFA 
CLFGLLQAL VLAVS LREPNGDEAATDWESEGLEREGEEQRGDPG 
KGL 


5438 


2443 


1152 


TKPRKRRHQPASQRQRPWSSDSTGDLLARGKGRKEENKGSDRVS 
LAPPSIiRRPMMCQSEARQGPELRAAKWLHFPQIAIiRRRLGQLSC 
MSRPALKLRSWPLTVLYYLLPFGALRPLSRVGWRPVSRVALYKS 
VPTRLLSRAWGRLNQVELPHWLRRPVYSLYIWTPGVNMKEAAVE 
DLHHYRNLSEFFRRKLKPQARPVCGLHSVISPSDGRILNFGQVK 
NCEVEQVKGVTYS LES FLGPRMCTEDLPFPPAASODS FKNQLVT 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








regnelyhcviylapgdyhgfhsptdwtvshrrhfpaslmsvnp 
gmar w ikelfchnerwltgdwkhgffsltavgat \ nwgs iriy 
pdrdlhtnsprhskgsyndfsfvthtnregvpmalrgehlg/qs 
fnlgsti vli feapkdfnfqlktgqkirfgealgsl 


5439 


2443 


1152 


TKPRKRRHQPASQRQRPWSSDSTGDL1ARGKGRKEENKGSDRVS 
LAP P S LRRPMMCQSEARQG PELRAAKWLHFPQIiAIiRRRLGQLS C 
MSRPALKLRSWPLTVLYYLXjPPGALRPLSRVGWRPVSRVALYKS 

vptrllsrawgruvqvblphwlrrpvyslyiwtfgvnmkeaave 
dlhhyrnlsepprrklkpoarpvcglhsvispsdgrilmpgqvk 
nceveqvkgvtyslesflgprmctedlpfppaascdsfknqlvt 
regnelyhcviylapgdyhcfhsptdwtvshrrhfpgslmsvnp 
gmarwi kelfchn3rvvltgdwkhgffsltavgat\nmgs iriy 

FDRDLlfraSPRHSKBSYNDFSFVTHTNREGVPMALRGEHLG/QS 
FNLGSTIVLIFEAPKDFNFQLKTGQKIRFGEALGSli 


5440 


593 


2S3 


EPI PVTPDHRLVTMTHI V\QTFSPVNS \GQPPNYEMLKEEQEVA 
MLGAPHNPAPPMSTVIHIRSETSVPDHWWSLFNTLFMNTCCLG 
FIAFAYSVKS RDRKMVGDVTGAQAYASTAKCLN I WA^I LG I FMT 
ILLIIIPVLWQAQR 


5441 


2 


2054 


CRDGGKNGFMVSPMKPLE IKTQCSGPRMDPKICPADPAFFS FIN 
NSDLWVANIETGEERRLTFCHQGLSNVLDDPKSAGVATFVIQEE 
FDRFTGYWWCPTASWEGSEGLKTl»RILYEEVDESEVEVIHVPSP 
ALEERKTDSYRYPRTGSKNPKIAJbKIAEFQTDSQGKIVSTQEKE 
LVQPFSSLFPKVEYIARAGWTRDGKYAWAMFLDRPQQWLQLVLL 
P PAL FI PSTENEEQ \ RLAS ARAVPRNVQP Y WYEE VTNVWINVH 
DIFYPFPQSEGEDELCFLRANECKTGFCHLYKVTAVLKSQGYDW 
SEPFSPGEGEQSLTNAIWVNEETKLVYFQGTKDTPLEHHLYWS 
YE AAG E I VRLTTPGFSHSCSMSQNFDMFVSHYS S VSTPPC VHVY 
KLSGP DDD PLHKQPRFWASMMEAAKI FHFHTRSDVRL YGM I YKP 
HALQPGKKHPTVLFVYGGPQVQLVNNSFKGIKYLRLNTLASLGY 
AVWI DGRGSCQRGLRFEGALKNQMGQVEI EDQVEGLQFVAE KY 
GFIDLSRVAIHGWSYGGFLSLMGLIHKPQVFKVAIAGAPVTVWM 
AYDTG YTER YMD VPENNQHGYEAGS VALHVEKLPNE PNRLLILK 
GFLDENVHFFHTNFLVSQLIRAGKPYQLQVALPPVS PQI YPNER 
HS IRCPESGEH YEVTLLHFLQE YL 


5442 


1 


3474


(^RSRRRSPDMPEAXPAAKKAPKGKDAPKGAPKEAPPKEAPAE 
APKEAPPEDQSPTAEE PTGVFLKKPDSVSVETGKDAVWAKVKG 
KELPDKPT I KWFKGKWLEljGS KSGARFS FKESHNSASNVYTVEL 
HIGKWLGDRGYYRLEVKAKDTCDSCGFNIDVEAPRQDASGQSL 
ESFKRTSEKKSDTAGELDFSGLLKKRBVVEEEKKKKKKDDDDLG 
I P PE IWELLKGAKKS E YBKIAFQYG I TDIjRGM LKRLKKAKVE VK 
KSAAFl'KKIiDPAYQVDRGNKIKLMVSISDPDLTLKWFKNGQEIK 
PSS KYVFEN VGKKRILT INKCTLADDAAYBVAVKDEKCFTELFV 
KEPPVLIVTPLEOQQVFVGDRVEMAVEVSEEGAQVMWMKDGVEL 
TREDSFKARYRFKKDGKRHILI FSDVVQEDRGR YQVITNGGQCE 
AELI VE EKQLE VLQDI ADLTVKASEQAVFKCEVS DEKVTGKWYK 
NGVEVRPSKRITISHVGRFHKLVIDDVRPEDEGDYTFVPDGYAL 
GS LSAKLNFLE IKVE YVPKQ\EPP K I PLGFASGGKTSENAD/ IV 
WAGNKLRLD V \S ITGEAPS PFAT\WLKG\DEVFTTTEGRTRIE 
KRVDCSS FVTESAQREDEGRYTIKVTNPIGBDVAS IFLQWDVP 
DPPEAVRITSVGEDWAILVWEPPMYDGGKPVTGYLVERKKKGSQ 
RWMKLNFEVFTETTYESTKMIEGILYEMRVFAVNAIGVSQPSMN 
TKPFM PIAPTS3 PLHLI VED VTDTT7TLKWRPPNR I GAGGI DG Y 
LVEYCLEGSEEWVPANTEPVERCGFTVKNLPTGARILFRVVGVN 
IAGRSEPATLAQPVTIREIAEPPKIRLPRHLRQTYIRKVGEQLN 
LWP FQG KPR PQ WWTKGG APLDTSRVHVRTS D FDTVFFVRQAA 
RSDSGE YELS VQ I ENMKDTATIR3 RWEKAGP P INVMVKE VWGT 
NALVEWQAP KDDGWSE I MGYFVQKADKKTMEWFNV YERNRHTS C 
TVS DL I VGNE YY FRVYTENI CGLS DSPGVSKNTAR I LKTG I TFK 
P PEYKEHDFRMAPKFLTPLIDRWVAGYSAAIiNCAVRGHPKPKV 
VWMKNKME IREDPKFL I TNYQGVLTLNIRRPS PFDAG TYTCRAV 
NELGEAliAECKLEVRVPQ 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


5443 


66 


1003 


SRGQLiDAGQSS EQHGGNRQ PEQS RS RSS S S SSS PRRSRSAAE PA 
MALS MPLNG LKEEDKE PLI ELFVKAGSDGES IGNCPFS QRLFMI 
LWLKGWPSVTTVDLKRKPADLQNLAPGTHPPFI TFNSEVKTDV 
NKI EE FIiE EVLCPPKYLKLS PKH PESNTAGMDI FAKFS AYIKNS 
RPEANEALERGIiLKTLQKLDByLWSPIiPDEIDENSMEDlKFSTR 
KFLIX^MTliADCmjLPKLHlVKVVAKKYRNFDIPKEMTGIWRy 
IiTNAY SRDEFTNTCPS DKEVE I \ AYSDVAKRIjHQVKSRIiLKE VS 
FMSSP 


5444 


2 


344 


SGPIGVTGAQMAKWLRDYLSFGGRRPPPQPPTPDYTESDILRAY 
RAQKNLDFEDPY*DSESRLEPDPAGPGDSKNPGDAKYGSPKHRL 
1 KVEAADMARAKALLGGPGEELBADTEYLDPFDAQPHPAPPDDG 
YMEPYDAQWVMSELPGRGVQLYDTPYEEQDPETADGPPSGQKPR 
QSRMPQEDERPADEYDQPWEWKKDHISRAFAVQFD5PEWERTPG 
S AKELRR P PPRS PQPAERVDPAL PLE KQP WFHG PLNR ADAESLL 
SLCKEGSYLVRLSETNPQDCSLSLRSSQGFLHLKFARTRENQW 
LGQHSGPFPS VPEfiVLHYSSRPLPVQGAEHLALLYP WTQTP* Q 
* PDWGDRRPNGQVATGLPE LWGAEAPSAAAHPGLHRERH PEGLP 
RAEKPGLRG PLLGLRE PLGAG PRGP WGLQE PRRCQVWFSQAPAH 
QGCGCGYGQSQGPSGRPRGGAGSRH 


5445 


2364 


486 


ILS RGFLGS VE I CIQLPL PASEP VLLliTWARRR WRETRSRREPT 
TLRAQS VCPW W I * BTRMNRSIPVEVDESEP YPSQLLKPI PEYSP 
EBESEPPAPNIRNMAPNSLSAPTMLHNSSGDFSQAHSTLKLANH 
QRPVSRQVTCLRTQVIiEDSEDSFCRRHPGI*GKAFPSGCSAVSEP 
AS ES WGALPAEHQFS FMEKRNQWL VSQLSAASPDTGHDSDKS D 
QSLPNASADSWSGSQEMVQRPQPHRNRAGLDLPTIDTGYDSQPQ 
DVLGIRQLBRPIiPLTSVCYPQDLPRPLRSREFPQFEPQRYPACA 
QML P PNIjS phap WNYHYHCPGS PDHQ VP yghd ypraayqqviqp 
ALPGQPUPGAS VRGLHP VQXVILNYPS P WDQEERPAQRDCS FPG 
LPRHQDQPHHQPPNRAGAPGESLECPABLRPQVPQPPSPAAVPR 
PPSNPPARGTLKTSNLPEELRKVFITYSMDTAMEWKFVNFLLV 
NGFQTAIDI FEDRIRGIDI IKWMER YLRDKTVMI I VAIS PKYKQ 
DVEGAESQLDEDSHGLHTKYIHRMMQIEFIKQGSMNFRFIPVLF 
PNAKKEHVPTWLQhnHVYSWPKNKKNILLRLIiREEEYVAPPRGP 
LPTLQWPL 


5446 


972 


161 


SSWSWCTGRMRKTRLWGLLWMLFVSELRAATKLTEEKYELKEGQ 
TIjDV KCDYTLEKFASSQKAWQI I RDGEMPKTLACTERPS KNSHP 
VQVGRI ILEDYHDHGLLRVRMVNIiQVEDSGLYQCVrYQPPKEPH 
MLFDRIRLWTKGFSGTPGSNENSTQNVYKIPPTTTKALCPLYT 
TpRTVT^PPKSTADVSTPDSEI^Ttn/TOIIRVPVFNlVIIiLA 
GG FLS KS LVFSVL FAVTLRS FVP * AHE PTRMSSDFQPHPSGSCA 
KGGGRR 


5447


207 


617 


MTARTIiS LMAS LVAYDDS DS EAETEHAGS FNATGQQKDTSGVAR 
PPGQDFASGTIiDVPKAGAQPTKHGS CEDPGG YRLPLAQLGRS DR 
GSCPSQRLQWPOKSPQVTFPIKEPSCSSLWTSHVPASHMPLAAA 
RFKQVKLSRNFPKSSFHAQSESETVGKNGSSFQKKKCEDCVVPY 
ii/KjUjKUKUfUj£> 1 tJLlsAGKJJVEPQGPPAGRAPAPLYVGPGVSEF 
IQPYLNSHYKETTVPRKVLFHIiRGHRGPVNTIQWCPVLSKSHML 
LSTSMD KTF KVWNAVDSGHCLQTYS LHTBAVRAARWAP CGRRIL 
SGGFOFALHLTDLETGTQLFSGRSDFRITTIiKFHPXDHNIFLCG 
GFSSBMKAWDIRTGKVMRSYKAriQQTIiDILFLREGSEFLSSTD 
ASTRDSADRT1IAWDFRTSAKISNQIFHERFTCPSI*ALHPRBPV 
FLAQTNGNYLALFSTVWPYRMSRRRRYEGHKVBG YS VGCECS PG 
GDLLVTG SADGRVLM YS FRTAS RACTIiQGHTQAC VGTTYHPVLP 
S VLAT CS WGGDMKI WH *AFHWLS LGEA I G DLAPARG YSGBQRSL 
KSPSPSKSLLVLLCGRAMFQPATCPWQLPAIiSK 


5448 


194 


1833 


MASKVTDAIVWYQKKIGAYDQQIWEKSVEQREIKGLRNKPKKTA 
HVKPDL IDVDLVRGSAFAKAKPESPWTS LTTKG I VRWF FPF FF 
RWWLQVTSKVIFFWLL VLYLLQVAA I VL FCSTSS PHSI PLTE VI 
GPI^MLLLGTVHCQIVSTRTPKPPLSTGGKRRRICLRKAAHLEV 
HREGDGSSTTDNTQEGAVQNHGTSTSHSVGTVFRDLWHAAFFLS 
GSKKAKWSIDKSTETDNGYVSLDGKKTVKSGEDGIQNHBPQCBT 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








IRPEETAWN^LRIvGPSKDTQRtlTNVSDkVSSBBGPBTGySL 
RRHVDRTS EG VLRNRKSHHYKKHYPNEDA P KSG TSCSSRCSSSR 
QDSESARPESETEDVLWEDLLHCAECHSSCTSETDVENHQINPC 
VKKBYRDDPFHQSHLPWLHSSHPGLEKISAIWEGNDCKKADMS 
VLB 1 SGMIMNRVNSH I PG IG YQI FGNAVSLILGLTPFVFRLSQA 
TDLEQLTAHSASELYVIAFGSNEDVIVLSMVI 1 S FWRVS LVWI 
FFPLLCVAERTYKQVGIM *TSEGVLRNRKSHHYKKHYPNEDAPK 
SGTSCSSRCSSSRQDSESARPESETEDVLWEDLLHCAECHSSCT 
SETDVENHQINPCVKKEYRJDDPFHQSHLPWLHSSHPGLBKISAI 
VWEGNDCKKADMSVLEISGM IMNRVNSHI PGIGYQIFGNAV5LI 
U3LTPFVFRLSQATDLEQLTAHSASELYVIAFGSNEDVIVLSMV 
I IS FWRVSLVWI FFFLLCVAERTYKQVGIM 


5449 


194 


1833 


MA5KVTDAIVWYQKKIGAYDQQIWBKSVEQRBIKGLRKK^>KkTA 

HVKPDLIDVDLVRGSAFAKAXPBS PWTSLTTKG I VRWFFPFFF 

RWWLQVTSKVIFFWIXVLYIi^QVAAIVLFCSTSSPHSIPLTEVI 

GPIWLMIXLGTVHCQIVSTRTPKPPLSTGGKRRRKLRKAAHLEV 

HREGDGSSTTDNTQBGAVQNHGTSTSHSVGTVFRDLWHAAFFtiS 

GSKKAKNS IDKSTBTDNGYVSLDGKKTVKSGEDGI QNHEPQCST 

IRPEETAWNTGTLRNGPSKDTQRTITNVSDEVSSEEGPETGYSL 

RRHVDRTSEGVLRNRKSHHYKKHYPNEDAPKSGTSCSSRCSSSR 

QDSESARPESETEDVLWEDLLHCAECHSSCTSBTDVENHQ3NPC ' 

VKKEYRDDPFHQSHLPWLHSSHPGLBKISAIVWEGNDCKKADMS 

VLB I SGM IMNR VNS HI PG I G YQIFGNAVSLILGLTPFVFRLS QA 

TDLEQLT AHSAS EliYVIAFG SNED V I VLSMVI I S FWRVS LVWI 

FFFLLCVAE RTYKQVGIM * TS EGVLRNRKS HHY KKHYPNE DA? K 

SGTSCSSRCSSSRQDSESARPESETEDVLWBDLLHCAECHSSCT 

SETDVENHQINPCVKKEYRDDPFHQSHLPWLHSSHPGLEKISAI 

VWEGNDCKKADMSVLEISGMIMNRVNSHIPGIGYQIFGNAVSLI 

I/3LTP FVFRLSQATDLEQLTAHSASELYVIAFGSNEDVI VLSMV 

IISFVVRVSLVWIFFFLLCVABRTYXQVGIM 


5450 


" ~ 813* 


1242 


GQQFASFFG*NHPEVTVAMALTDIDLQLQFSMSQPEALLLIiAAG 
PADHLLLQLYSGHLQVRLVLGQE ELRLQTPAETLLS DS I PHTW 
LTVVEGWATLSVDGFLNASSAVPGAPLEVPYGIiFVGGTGTLGLP 
YLRGTSRPLRGCLHAATLNGRSIjLRPLTPDVHEGCAEEFSASDD 
VAIiGFSGPHSLAAFPAWGTQDEGrLEFTLTTQSRQAPLAFQAGG 
RRGDPIYVDIFEGHLRAWEKGQGTVLLHNSVPVADGQPHEVSV 
HINAHRLEI S VDQYPTHTSNRGVLSYLEPRGSLLLGGLDABASR 
HLQEHRLGLTPEATNAS LLG CMEDLS VNG QRRG LRE ALLTRNMA 
AGCRLEEEEYEDDAYGHYEAFSTLAPBAWPAMELPEPCVPEPGL 
PP\^ANFTQLLTISPLWABGGTAWLEWRHVQPTLDLWEAELRK 
SQVLFSVTRGAHYGEI^LDIIX^O^KMFTLLDVVNRKARFIHD 
GSEDTSDQLVLEVSVTARVPMPSCLRRGOTYLLP IQVNPVNDP ? 
HI I FPHGSLMVILEHTQKPLGPEVFQAYDPDSACEGLTFQVLGT 
SSGLP VERRDQPGEPATEFS CRELEAGSL VYVHCGG PAQDLTFR 
VSDGLQAS P PATLKWAIRPAI Q I HRS TGLRLAQGSAMP ILPAN 
LSVETNAVGQDVSVLFRVTGALQFGELQKHSTGGVEGAEHWATQ 
AFHOJIDVEO/SRVRYLSTDPQHHAYDTVBNIJUjEVQVGQE ilsnl 

SFPVTI qratvwmlrleplhtqntqqetlttahleatleeagps 

PP TFH YE VVQ AP RKGNIjQLQGTRL S DGQG FTQDD I Q AGR VTYGA 

TARAS EAVEDTFRFRVTAPP YFS P LYT FP IHI GGDPDAPVLTNV 

LLWPEGGEGVLSADHLFVKSLNSASYLYEVMERPRLGRI*AWRG 
TODKTTMVTS FTl^DLLRGRr J VYnHnnQl?TTffnnTT5Pini'T r Drvr»i? 

SSGDMAWEEVRGVFRVAIQPVNDHAPVQTISRIFHVARGGRRLL 
TTDD VAF S DADSG FADAQ LVLTR KDLL FGS I VAVD EPTR P 1 YR F 
TQEDIjR KRR VLFVHSGACRGWIQLQVSDGCHQATALLEVQAS EP 
YLRVANGSSLWPQGGQGTIDTAVLHIiDTNLDI RSGDEVHYH VT 
AGPRMGQLVRAJGQPATAFSQQDLIJX3AVLYSHNGS 1^ PEDTMAF 
S VEAGP VHTDATLQ VTI ALEG PIAPLKL VRHKKI YVFQGEAABI 
RRDQLEAAQ EAVP PAD I VFSVKS PPSAG YLVM VSRGALADBP PS 
LDP VQS FS QBAVDTGRVLYLHS RPEAWSDAFS LDVASGLGAP LE 
GVLVBLEVLPAAI PLE AQNFSVP EGGSLTLAP PLLRVSGPYFPT 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LLGLSLQVLEPPQHGPLQKEDGPQARTLSAFSWRMVEEQLIRYV 
HDGSBTLTDSPVLMANASEMDRQSHPVAFTVTVLPVNDQPPILT 
TNTGLQMWEGATAPIPAEALRSTDGDSGSEDLVYTIEQPSNGRV 
VLRGAPGTEVRSFTQAQLDGGLVLFSHRGTLDGGFPFRLSDGEH 
TSPGHFFRVTAQKQVLLSLKGSQTLTVCPGSVQPLSSQTLRASS 
SAGTDPQLI.LYRWRGPQLGRLFHAQQDSTGEALVNFTQAEWA 
GN I LYEHEM P PE PFWEAHDTLELQLS S PPARDVAATLAVAVS FE 
AACPQRPSHLWKNKGLWPEGQRARIWAALDASKLLASVPSPQ 
RSEHDVLFQVTQFPSRGQiiLVSEEPLHAGQPHFLQSQIAAGQLV 
YAHGGGGTQQDGFHFRAHIjQGPAGASVAGPQTSEAFAITVRDVN 
ERPPQPQASVPLRLTRGSRAPISRAQLSWDPDSAPGEIEYEVQ 

raphngflslvggglgpv^rftqadvdsgrlafvangss VAGI F 

QLSMSDGAS PPLPMSLAVDI LPSAI EVQLRAPLEVPQALGRSSL 
SQQQLRWSDREEPEAAYRLIQGPQYGHIiLVGGRPTSAFSQFQI 
DQGEWFAFTNFSSSHDHFR\n^AIARQVNASAVVWrvp rt.t .in/ 
WAGGP WPQGATJjRLDPT VLDAGELANRTGS VPRFRLLEG PRHGR 
WRVPRARTEPGGSQlVEQFTQQDLEDGRliGLEVGRPBGRAPGP 
AGDS LTLELWAQGVPPAVASLDFATEP YNAARPYS VALLSVPBA 
ARTEAGKPESS TPTGEPGPMASSPEPAVAKGGFLS FUSANMFS V 
1 1 PMCLVLLLIiAL ILPLLFYI»RKRNK7GKHD VQVIiTAKPRNGLA 
GDTETFRKVE PGQAI PLTAVPGQGPPPGGQPDPELLQFCRTPNP 
ALKNGQYWV 


5451


1


2274 


RDSSEQGRTGDTLGRPSACMDALKPPCIiWRNHERGKKDRDSCGR 
KNSEPGSPHSLEALRDAAPSQGLNPLLLPTKMLFI FNFLFSPLP 
TPALICILTFGAAIFLWLITRPQPVLPLLDLNNQSVGIEGGARK 
GVSQKNNDLTSCCFSDAKTMYEVFQRGLAVSDNGPCLGYRKPNQ 
PYRWLSYKQVSDRAEYLGSCLLHKGYKSSPDQFVGIFAQNRPEW 
IISELACYTYSMVAVPLYDTjGPKAIVHIVNKADIAMVICDTPQ 
KALVLlGNVEKGFTPSLKVirLMDPFDDDLKQRGEKSGIEILSL 
raASWLGKEHFRKPVPPSPEDLSVlCFTSGTTCDPXGAMITHQN 
IVSNAAAFLKCVEHAYEPTPDDVAISYLPLAHMFERIVQAVVY9 
CGARVG FFQGD I RLLADDMKTLKPTLFP AVPRLLNR I YDKVQNE 
AKTPLKKFLIiKLAVSS KFKELQKGI I RHDSFWDKL I FAK I QDS L 
GGRVRVIVTGAAPMSTSVMTFFRAAMGCQVYEAYGQTECTGGCT 
FTLPGD WTSGHVG V PLACNYVKLEDVADMNYFTVNNEGEVC I KG 
TNVFKGYLKDPEKTQEALDSDGWIiHTGDIGRWI*PNGTLKIIDRK 
xuHj.rxujKyLjtL,x iafi5KJ.enx YNRSQFVLQIFVHGESLRSSLVGV 
VVPDTDVLPSFAAKLGVKGSFEEJ^CXJNQVVREAILEDLQKiaKE 
SGLKTFEQVKAI FLHPEPFS IENGLLTPTLKAKRGELSKYFRTQ 
XD5LYEHIQD 


5452 


1833 


1138


SR VPS LCLS LSLS LS PSRE P VAGAPGCGTAGPPAMATL WGGLLR 
LGSLLSLSCLAIiSVLLLAQLSDAAKNFEDVRCKCICPPYKENSG 
HI YNKN I SQKD CDCLHWE PMPVRGPDVE AYCLRCE CKYEERSS 
VTIKVTI I IYLS ILGLLLLYMVYLTLVE P ILKRRLFGHAQL I QS 
DDD IGDHQPFANAHDVLARS RS RANVLNKVE YAQQRWKIjQVQEQ 
RKSVFDRHWLS 


5453 


111 


1520 


PSI PAAVPQSAPPE PHREETVTATATS QVAQQPPAAAAPGBQAV 
AGPAPSTVPSSTS KDR P VS QPSLVGSKEEPPPARSGSGGGSAKE 
PQBERSQQQDDIEELETKAVGMSNDGRFLKPDIEIGRGSFKTVY 
KGLDTETTVEVAWCELQDRKLTKSERQRFKEEAEMLKGLQHPNI 
VRFYDSWESTVKGKKCIVLVTEI4MTSGTLKTYLKRFKVMKIKVL 
RS WCRQIIi KGLQFLHTRTPP I IHRDLKCDNIFITOPTOS VKIGD 
LGLATUG^FAKSVIGTPBFMAPBMYEEKYDESVDVYAFGMCM 
LEMATSE Y PYSECQNAAQI YRR VTSGVKPAS FDKVAI PEVKE 1 1 
EGCIRQNKDERYSIKDLLNHAFFQEETGVRVELABEDDGEKIAI 
KLWLRIEDI KKLKGKYKDNEAI E FS FDLERNVPED VAQEMVB S G 
YVCEGDHKTMAKAIKDRVSLIKRKREQRQL* 


5454 


111 


1520 


PSIPAAVPQSAPPEPHRBETVTATATSQVAQQPPAAAAPGEQAV 
AGPAPSTVPSSTS KDRPVSQPSLVG3KEEPPPARSGSGGGSAKE 
PQEERSQQQDDIEEIiETKAVGMSNDGRFLKFDIEIGRGSFKTVY 
KGLDTETTVEVAWCELQDRKLTKS ERQR FJCEEAEMLKGLQHPNI 






WO 01/53312 



PCT/US00/34263 



SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








VRFYDSWESTVKGKKCIVLVTELMTSGTLKTYLKRFKVMKIKVir' 
RSWCRQILKGLQFLHTRTPPIIHRDLKCDNIPITGPTGSVKIGD 
LG LATLKRAS FAKS VI GTPEFMAPEM YEE KYDES VDVYAFGMCM 
LEMATSEYPYSECQNAAQIYRRVTSGVKPASFDKVA1PEVKEII 
EG C I RQNXDER YS I KDLLNHAFFQEETG VR VELAEEDDGE KI A I 
iuj w l»k J.2.U x kjujIUsK y kdneai E FS FDLERNVPEDVAQEMVESG 
YVCEGDff KTMA1CAIKDR VSLIKRKREQRQL * 


5455 


1359 


377 


LTMVSPATRKSLPKVKAMDFITSTAILPLLFGCIOTFGLFRLLQ 
W VRGKAYLRNAWVI TGATSGLGKECAKVFTAAGAKLVLCGRNG 
GALEELI RE LTASHATKVQTHKP YLVTFDLTDS GA I VAAAAE IL 
■ QC FGYVD I LVNHAGI S YRGTIMDTTVDVDKRVMEINYFGP VALT 
KALLPSMIXRRQGHI VAISS IQGKMS I PFRSA YAAS KHATQAFF 
DCLRAEMEQYE I EVT V I S PG YIHTNLSVNAI TADGSRYG VMDTT 
TAQGRS P VEVAQD VLAA VGKKKKDVT LADLL PS LAVYLRTLAPG 
LFFSLKASRARKERKSKNS 


5456 


2 


2332 


CGAGLVAAGAVLVbYPASRAGERTRV?3SPAPSSLPLHSPGACX3 
TE VDMDPQRSPLLEVKGNIEIjKRPLI XAPSQLPLSGSRIiKRRPD 
QMEDGIiEPEKKRTRGLGATTKITTSHPRVPSLTTVPQTQGQTTA 
QKVSKKTGPRCSTAIATGLKNQKPVPAVPVQKSGTSGVPPMAGG 
KKPSKRPAWDLKGQLCDLMAELKRCRERTQTLDQENQQLQDQLR 
DAQQQVKAI^TERTTLBGHLAKVQAQAEQGQQEIiKNLRACVLEL 
EBRLSTQEGLVQELQKKQVELQEERRGLMSQLEEKERRLQTSEA 
ALS S SQAEVAS LRQETVAQAAliLTEREERLHGLEMERRRIiHNQL 
QELKGNIRVFCRVRPVLPGEPTPPPGI*LLFPSGPGGPSDPPTRIj 
SLSRS DERRGTLSGAP AP PTRHDFSFDRVFPPGSGQDEVFEE I A 
MLVQSALDG YP VCI FAYGQTGSGKTFTMEGGPGGDPQLEGL I PR 
ALRHLFSVAQELSGQGWTYS FVAS YVE I YNET VRDLLATGTRKG 
QGGECEI RRAGPGSEELTVTNARYVPVSCEKEVDALLHLARQNR 
AVARTAQNERSSRSHSVFQLQISGEBSSRGLQCGAPLSLVDIAG 
SERLDPGLALGPG ERERLRETQAINS SLSTLGLVIMALSNKE S H 
VPYRNSKLTYLLQNSIXJGSAKMLMFVNISPLEENVSESLNSLRF 
ASKVEPSVLFGTAQSNRKVJKTDPDLCVCVCVCVCVCVCVCVCVP 
MSMYRVRGGRVAGGCFIGWRAPCPRAIK 


5457 


2 


1540 


DDFVERRRWTRTTCLVRSPPHVPVCGHACSWNGGSLDPLKGTPA 
LLRSAERLMRKVKKLRJLDKENTGSWRSFSLNSEGAERMATTGTP 
TADRGDAAATDDPAARFQ VQ KHS WDGLRS 1 IHGSRXYSGL I VNK 
APHDFQF VQKTDE SGPHSHRLY YLGMP YGSRENS LZjYSE I PKKV 
RKEALLLLSWK0MIJ5HFQATPHHGVYSRSEELLRERKRLGVFGI 
TSYDFHSESGI.FLFQASNSLFHCRDGGKNGFMVSPGPGCVSPMK 
PLEIKTQCSGPRMDPKICPADPAFFSFINNSDLWVANIETGEER 
RLTFCHQGLSNVLDDPKSAGVATFVIQEEFDRFTGYWWCPTASW 
EGSEGLKTLRILYEEVDESEVEVIHVPSPALEERKTDSYRYPRT 
GSKN P KI ALKLAEFQTDSQGKI VSTQEKELVQP FS SLPPKVE YI 
ARAG WTRDGKYAWAM FLDR PQQWLQLVLLP PAL F I PSTENEEQA 
ASIiCQSCPOBCPAVCGVRGGHQRLDQCS 


5458 


6642 


4022 


FVPGLREPQWEPAQPSATMSAPSEEEEYARLVMEAQPEWUIAEV " 
KRLSHELAErTREKIQAAEYGLAVLEEKHQLKLQFEELEVDYEA 
IRSEMEQLKEAFGQAHTNHKKVAADGESREESLIQESASKEQYY 
VRKVLELQTELKQIJ^VLTNTQSENERLASVAQELKEINQNVE I 
QRGRLRDDIKEYKFREARLLQDYSELEEENISLQKQVSVLRQNQ 
VE FEGLKHE I KRLEEETE YLNSQLEDAI RLKEIS ERQLEEALET 
LKTEREQKNSLRKELSHYMS INDSFYTSHLHVSLDGLKFSDDAA 
EPNNDAEALVNGFEHGGLAKLPIiDNKTSTPKKEGLAPPSPSLVS 
DLdSEIiNI SE IQ KL KQQLMQMEREKAGLLATLQDTQ KQLEHTRG 
SLSEQQEKVTRLTENLSALRRLQASKERQTALDNEKDRDSHEDG 
DYYEVDIWGPEIIACKYKVAVAEAGELREQLKALRSTHEAREAQ 
HAEEKGR YEAEGQALTEKVS LLEKASRQDR ELLARLEKEI*KKVS 
DVAGETQGSLS VAQDEIiVTFS EBLANLYHHVCMCNNETPNRVML 
DYYREGQGGAGRTSPGGRTS PEARGRRSP XLLPKGLLAPEAGRA 
DGGTGDS S PS PGSSLPS PLS DPRREPKtll YNLI AI IRDQ I KHLQ 
AAVDRTTELSRQRIASQELGPAVDKDKEALMEEILKLKSLLSTK 
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








req ittlrtvzjkan kqtae vaiA^lkskyenekamvtetmMISIr 
nelkalkedaatfsslram fatrcdey i tqldemqrqlaaabde 

KKTLNSLLRMAIOOKLALTORTjKlil.PT.nHBfiTODrD a vtv a nvmv 
VATPSVSHTCACASDRASGTGLANQVFCSEKH3 IYCD 


5459


316 


1262 


rgghrlsgmasnfndivkqgyvrirsrrLgiyqrcwlvfkkass 
kgpkrlekfsderaayfrc^hkvtelnnvknvarlpkstkkhai 
giyfnddtsktfacesdleadewckvlqmecvgtrindrslgep 

DLIATGVEREQSERFNVYI^PSPNLGCYMGECALQITYEYICLW 

DVQJJPR VKL ISWPLSALRR YGRDTTWPTFEAGRMCE TGEGLFIF 

QTRDGEAI YQKVHSAALAI AEQHERLLQS VKNSMLQMKMS ERAA 

SLSTMVPLPRSAYWQHITRQHSTGQIiYRLQDVSSPLKIiHRTETF 
PAYRSEH 


5460 


45 


2057 


rpgcragelstgsrarervrnrvsapcgqdsrrcdpevlrgrsp 
glglaempscgactcgaaavrlitsslasaqrgisggrihmsvl 

GRLGTFETQ I LQRAPLRS FTETPAY FAS KDG I S KDGSGDGNKKS 

asegsskksgsgnsgkggnqlrcpkcgdlcthvetfvsstrfvk 

CE KCHHFFWLS EADS KKS 1 1 KE PESAAE AVKIiAPQQKP PPPP K 
KI YNYLDK YWGQS FAKXVLS YAVYNHYKR I YNNIPANLRQQAE 
VBKQT3 LTPRELEI RRREDEYRFT KLLQ I AGISPHGKALGASMQ 
QQVNQQIPQEKRGGEVLDSSHDDIKI*EKSNILLIiGPTGSGKTLL 
AQTLAKCLD VPFAI CDCTTLTQAG YVGEDI ES VI AKLLQDANYN 
VE KAQQGI VFLDE VDKI GSVPG I HQkRDVGGEGVQQGLIiKLLEG 
TIVNVPEKNSRKLRGBTVQVDTTNILFVASGAFNGLDRI ISRRK 
NEKYIiGFGTPSNI^KGRRAAAAADLANRSGESNTHQDIEEKDRIi 
LRHVEARDLI EFGM I PE FVGRLP WVPLHSLDEKTLVQI LTEPR 
t^VI PQYOJ^FSMDKCELNVTEDALKAlARLALERKTGARGIjRS 
IMBKLIiLEPMFEVPNSDlVCVEVDKEVVEGKKEPGYIRAPTKES 
SEEEYDSGVEEEGWPRQADAANS 


5461 


1481 


160 


±«rir**jfri\^j^v»KAKKWKKKKKPGAPE 

SHRLPGDCFLLLVLLLYAPVGFCLLVLRLFLGIHVFXVSCALPD 
SVLRRFVVRTMCAVl^LVARQEDSGlJlDHSVRVlilSNHVTPFDH 
NIVNLLTTCSTPLLNSPPSFVCWSRGFMEMNGRGELVESLKRFC 
ASTRLPPTPLLLFPEEEATNGREGLLRFSSWPFSIQDWQPLTL 
QVQRPLVS VTVSDASWVS ELLWSLFVP FTVYQ VRW LRP VH RQI#G 
EANE E F ALRVQQLVAKELGQTGTRLTPADKAEHMKRQRH PRLRP 
QSAQSS FPPSPCjPS PDVQLATIAQRVKEVLPHVPIjG VIQRDLAK 
TGCVDLTITNLLEGAVAFMPEDITKGTQSLPTASASKFPSSGPV 
TPQPT ALTFAKSS WARQ ES LQER KQAL YEYARRRFTERRAQ EAD 


5462 


663


3353 


KIKERQMSANKSPPSAQKSVLPTAIPAVLPAASPCSSPKTGLSA 
RLSNGS FS APS LTNSRGS VHTVS FLLQ I GLTRE S VT I EAQELS L 
S AVKDLVCS IVYQKFPECGFFGMYDKIliLFPHDMNSENl LQlil T 
SADE I HEGDLVEWLS ALATVEDFQ I R PHTL YVHS YKAPT FCD Y 
CGEMLmLVRQOLKCEGCGLNYHKRCAFKIPmCSGVRKRRLSiJ 
VSLPGPGLSVPRPLQPEYVALPSEESHVHQEPSKRIPSWSGRPI 
WMEKM VMCRVKVPHTFAVHS YTRPTI CQYCKRLLKGLFRQGMQC 
KDCKFNCHKRCASKVPRDCLGEVTFNGEPSSLGTDTDIPMD IDN 
NDINSDSSRGLDDTEEPSPPEDKMFFIjDPSDLDVERDEBAVKTI 
SPSTSNNI PLMR WQS I XHTKRKSS TMVKEGWMVHYTSRDNIiRK 
RHYWRLDSKCLTLFQNESGSKYYKEIPLSEILRISSPRDFTNIS 
wuanrxiv.r aix luinv X r VijisNNyDoSllNPVIjAATGvGLDvAQS 
WEKAI RQALMPVTPQAS VCTS PGQGKDHKDLSTS I SVSNCQ I QE 
NVDIS TVYQ X F ADEVLGSGQFG X VYGQKHRKTGRDVA. IKVIDKM 

rfptkqesqlrnevailqnlhhpgivnlecmfetpervfwmek 

LHGIX^EMILSSEKSRIjPERITKFMVTQILVALRNLHFKNIVHC 

dlkpenvli*asaepfpovklcdfgfari igeksfrrswgtpay 
lapevlrskgynrsldmhsvgviiyvslsgtfpfnededindqi 
OKaafmyppnpwreisgeaidliwnllqvkmrkrysvdkslshp 
wlqdyqtwldlrefetrigery ithesddarwbihaythnlvyp 
khfimapnpddmeedp 


5463 


237 


1012 


LLSVTMTTSRCSHLPEVLPDCTSSAAPVVKTVEDCGSIiVNGQPQ 
YVMQVSAKIX3QLI^TVVRTIiATQSPFNDRPMCRICHEG3SQEDL 
sequence


(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LSPCECTGTLGTIHRSCLEHWIiSSSNTSYCELCHPRFAVERKPR 
PL VEWLRNPGPQHE KRTLFGDMVCFLFI TPLATISGWLCLRGAV 
DHLHFSSRLEAVGL I ALTVAL FT I YLFWTLVS FR YHCRL YNEWR 
RTNQRVILLIPKSVNVPSNQPSLLGLHSVKRNSKETW 


5464




195 


677 


SPSMNPRKKVDLKJCIIVGAIGVGKTSLLHQYVHKTFyEEYQTTIj 
GAS XLS KI I ILGDTTLKLQI WDTGGQERVRSMVSTFYKGSDGCI 
LAFD VTDLES PEALDI WRGDVLAKI VPMEQS YPMVLLGNKIDLA 
DRKYQSILENHLTES IKLSPDQSRSRCC 


5465 


5278 


3348 


kGDPREFIR^RE^LECDYVsAHLHEWIDLIFGYKQQGPAA^A- 
VNVFHHLFYEGQVDIYNINDPLKETATIGFINNFGQIPKQLFKK 
PHPPKRVRSRLNGDNAGISVLPGSTSDKIFFHHIiDNLRPSLTPV 
XELKEPVGQI VCTDKGILAVEQNKVLI PPTWNKTPAWG YADLS C 
RLGTYBS DKAMTVYECLS EWGQ ILCAI CPNP KLVI TGGTSTWC 
VWEMGTSKEKAKTVTLKOAljtjGHTnTVTPlkTacr.avu t TT7er»co 
DRTCI I WDLNKLSFLTQLRGHRAPVSALCINELTGDI VSCAGT Y 
I H VWS INGNP I VS VNTFTGRS QQ 1 1 CC CMS EMNEWDTQNVI VTG 
HS DGWR FWRMEFLQVPETPAPEPAEVLEMQEDCPEAQ IGQEAQ 
DEDSSDSEADEQSISQDPKDTPSQPSSTSHRPRAASCRATAAWC 
TDSGSDDSRRWSDQLSLDEKDGFIFVNYSEGQTRAHLQGPLSHP 
HPNPIEVRNYSRLKPGYRWERQLVFRSKLTMHTAFDRKDNAHPA 
EVTALGISKDHSRILVGDSRGRVFSWSVSDQPGRSAADHWVKDB 
GGDS CSGCS VRFSLTERRHHCPNrfior ."rctw new wi cpt vd t v t 

SS PVRVCQNCYYNLQHERGSEDGPRNC 


5466


3 


992 


j^CAKAKAHASGRtVRlWR^ 

LGLAVGS YL VRRSR R PQ VTLLD PNE KYL LR LLDKTTVS HNTKRF 
RFALPTAHHTLGLPVGKHIYLSTRIDGSLVIRPYTPVTSDEDQG 
YVDLVT KVYLKG VH PKFPEGGKMS Q YLDSLKVGD WE FRGPSGL 
LTYTGKGHFNIQPKKKSPPEPRVAKKLGMIAGGTGITPMLQLIR 
AILKVPEDPTQCFLLFANQTEKDI I LREDLEELQAR YPNRFKLW 
FTIiDHPPKDWA YS KGFVTAOMIRERX PAPGDDVLVLLCGP PPMV 
QLACHPNLDKLGYSQKMRFTY 


5467


3103 


4 


GEALRVGTRGCRRDLPDPQAR I FI QKKDLBEDES VTAAHLKSRG 
RSPRKIDQFCNSSNMVHGSVTFRDVAIDFSQEBWECLQPDQRTL 
YRD VMLENYSHL I SLAGSS I SX PDVT TLLEQE KB PWMWRKETS 
RRYPDLELKYGPEKVSPENDTSEVNLPKQVIKQ I STTLGIEAFY 
FRNDSE YRQFEGLQGYQEGNINQKMI S YEKLPTHTPHASLI CUT 

HKPYECKECGKYFSCGSNLIQHQSIHTGEKPYKCKECGKAFQ2*H 
IQLTRHQKFKTGBKTFECKECGKAFNLPTOLNRHlOJrHTnrirTrT w 
ECKECGKS FNRSSNLTQHQS I HAGVKP YQCKECGKAFNRGSNLI 
QHQKIHSNEKPFVCKEOGMAFRYHYQLIEHCQIHTGEKPFECKE 
CGKAFTLLTKIiVRHQKIHTGEKPFECRECGKAFSLIiNQLNRHKN 
IHTGEKPFECKECGKSFNRSSNLVQHQSIHAGIKPYECKECGKG 
FNRG AHL IQHQKI HSNE KPFVCR ECEMAFR YHOQL I EHSR I HTG 
DKPFECQDOGKA FNRGS S LVQHQSI HTGEKPYBCKECGXAFRL Y 
LQLSQHQKTHTGEKPFECKECGKFFRRGSNLNQHRS IHTCKKPF 
ECKECGKAFRLHMHLIRHQKLHTGBKPFECKECGKAFRLHMQLI 
RHQKLHTGEKPFECKECGKVFSLPTQLNRHKNIHTGEKAS 


5468


225 


2976 


SFLTDLFQSLAQLENLCKQLYETTDTTTRLQAEKALVEFTNSPD 
CLSKCQLLLERGSS S YSQLLAATCLTKLVSRTNNPLP LEQRIDI * 
RNYVLNYLATRPXLATFVTQALIQL YAR I TKLGWFDCQKDDYVF 
RNA TTDVTRFLQDS VEYC I IGVTILS QLTNBINQVSATAFL I EA 
DTTHPLTKHRKIASSFRDSSLFDIFTLSGNLLKQASGKNLNLND 
ESQHGLLMQLLKLTHNCLNFDFIGTSTDESSDDLCTVQIPTSWR 
SAFLDSSTLQLSTIGRCBYEKTCALIiVQLFDQSAQSYQELLQSA 
SASPMDIAVQEGRLTWLVYIIGAVIGGRVSFASTDEQDAMDGEI, 
VCRVLQLMNLTDS RLAQAGNEKLELAMLS FFEQFRKI YIGDQ VQ 
KS S KL YRRLS EVLG LNDETMVLS VFIGKI I TNLK Y WGRCEP I TS 
KTLQLLNDLS IG YS S VRKLVKLSAVQFMLNNHTSEHFS FLGINN 
QSNLTDMRCRTTFYTALGRXLMVDLX3EDEDQ YEQ FMLPLTAAFE 
&VAQM FS TNS FNBQ EAKRT L VG LVRDLRG I AFAFKAKTS FMMLF 
EWIYPSYMPII^RAIELWYHDPAOTPVLKLMAELVHNRSQRLQ 
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








FDVSSPNGXLIiFRETSKMiTMYGNRiLTI^EVPKDQVYALKLkG 
IS ICFSMLKAAUSGS YVNFGVFRLYGDDALDNALQTPIKLliLS I 
PHSDIiUDYPKLSQSYYSLUSVLTQDHMNFIASLEPHVIMYILSS 
ISEGbTAI^TWCTGCCSOLDHIVTYIiFKQLSRSTKKRTTPLNO 
ESDRFLHIMQQHPEMIQQMLSTVLNIIIFEDC3WQWSMSRPLLG 
LILLNEKYFSDLRN3IVNSQPPEKQQAMHLCFENLMEGIERNLL 
TKNRDRFTQNLSAFRREVNDSMKNSTYGVNSNDMMS 


5469 


134 


2653


DQEFETSLVPWHLPMGWLC^LLFPVSCLVI^VASSGNMKVU} 
BPTCVSDYMSISTCBWKMNGPTNCSTELRLLYQLVFLLSBAHTC 
VPENNGGAGCVCHLLMDDVVSADNYTLDLWAGQQLLWKGSFKPS 
EHVKPJ^PGNLTVHTNVSDTLLLTWSNPYPPDNYLYNHI,TYAVN 
IWSENDPADFRIYNVTYLEPSLRIAASTLKSGISYRARVRAWAQ 
CYNTTWSEWSPSTKWHNSYREPFEQHLLLGVSVSCIVILAVCLL 
CYVS ITKI KKBWWDQI PNPARSRI#VAI I IQDAQGSQWE KRS RGQ 
EPAKCPHWKNCLTKI^PCFLEHNMKRDEDPHKAAKEMPFQGSGK 
SAWCPVEISKTVLWPESISWRCVELPEAPVECE3EEEVEEEKG 
SFCASPESSRDDFQEGREGIVARIiTESLFLDLLGEENGGFCQQD 
MGESCLLPPSGSTSAHMPWDEFPSAGPKBAPPWGKEQPLHLEPS 
PPAS PTQSPDNLTCTBTPLVIAGNPAYRSFSNSLSQSPCPREIjG 
PDPLIJUU4LEEVEPEMPCVPQLSEPTTVPQPEPE'fWEQII,RRW 
LQHGAAAAPVSAPTSGYQEFVHAVEQGGTQASAVVGLGPPGEAG 

ykafssliassavspekcgfgassgeegykpfqdlipgcpgdpa 
pvpvplftfgldrepprspqsshlpssspehlglbpgekvedmp 

KPPLPQEQATDPLVDSLGSGIVYSALTCHIiCXSHliKQCHGQEDGG 

qtpvt^aspccgcccgdrasppttplrapdpspggvpleaslcpa 

SLAPSGISEKSKSSSSFHPAPGNAQSSSQTPKIVNFV5VGPTYM 
RVS 


5470


17 


1418 


XACRIRTSIiMRG I AAVKKDAVEMLAS YGLA YSLMKFFlVISPMSDF 
KNVGLVFVNS XRDRTKAVLCMWAGAIAAVFHTL I AYSDLGY YI 
INKLHHVDES VGS KTRRAFL YIiAAFPFMDAMAWTHAG I LLKHK Y 
S FLVGCAS IS DVIAQWFVAILLHSHLECREPLLI P I LSL YMGA 
LVRCTTLCLG YYKNIHDI I PDRSGPBLGGDATIRKMLSFWWPLA 
LILATQRISRPIVNLFVSRDLGGSSAATEAVAILTATYPVGHMP 
YGWLTEIRAVYPAFDKNNPSNKLVSTSNTVTAAHIKXFTFVCMA 
LSLTLCFVMFWTPNVSEXIL IDI IG VDFAPAELCWPLH I FS FF 
PVPVTVRAHLTGWl^TLKKTFVLAPSSVLRIIVLIASLVVLPYL 
GVHGATLGVGSLLAGFVGESTMDAIAACYVYRKQKKKMENESAT 
EGEDSAMTDMPPTBSVTDI VEMRBENE 


5471 

5472 




658 


~ Sljn * r rvj ryiwww\ x / wiA/WUi V tiWAAAAAQQGGGGE PRRTEGV 
GPGVPGEVEMVKGQPFDVGPRYTQLQYIGEGAYGMVSSAYDHVR 
KTRVAI KKI S PFEHQTYCQRTLREIQ ILLRFRHBNVI G IRD I LR 
ASTLBAMRDVYI VQDLMETDLYJdiLKSQQLSNDH I C YFL YQ I LR 
GLKYIHSANVLHRDLKPSNLLINTTCDLKICDFGLARIADPEHD 
HTGFLTE YVATRW YRAPE IMLNS KGYTKS I DIWS VGC ILAEMLS 
F * "Vy« viiiM l*j j. LAio tf a v« U uNuI XNMKARNYXjQ SL 
PS KTKVAWAKLFPKSDSKALDLLDRMLTFNPNKR ITVEEALAHP 

YLBQYYDPTDEPVAEEPFTFAMELDDLPKERLKELIFQETARFQ 
PGVLBAP 




1469 


753 
2119 1 


liYVMAR YIiSDEE VAVS I DRLCKANGRS PS I PFUT VRI PGRARVR — 
DPQALWIFGYGSLVWRPDFAYSDSRVGFVRGYSRRFWQGDTFHR 
GSDKM PGR VVTLLEDHEGCTWGVAYQVQGEQVS KALKYLNVR EA 
VXXSGYDTKEVTFYPQDAPDQPLKAIiAYVATPQNPGYLGPAPEEA 
IATQIIiACRGFSGHNLEYLIiRVRDVMQLOGPQAQDEHLAAIVDA 
VGTMLPCFC PTEQALALV 


5473 


3 


1 

] 

1 ] 


FMNVKLLIQDIiEDIEQRVPVMDAQYKIITKTAHtiTKESPQEB^ 
KSMFATMSKLKEQLTKVKECYS PLL YESQQLL I PLBELEKQMTS 
FYDSIX»KINEI ITVLEREAQSSALFKQKHQELLACQENCKKTLT 
LIEKGSQSVQKFVTLSNVLKHFDQTRLQRQIADIHVAFQSMVKK 
TOD WKKHVE rNSRLMKKFEES RAELEKVLR IAQEGIiEEKGDPEE 
LLRRHTEFFSQIaDQRVLNAFLKACDELTDILPEQEQQGLQEAVR 
<ttHKQWKDIiQGEAP YHLLHIiKI DVE KNRFLASAEECRTELDRET | 








RDP VRDTPGTCHVTLKELRAAIDS TYRKLMEDPDKWKD YTSRFS 
EFSSWISTNETQLKGIKGEAIDTANHGBVKRAVKBIRNGVTKRG 
ETLS WLKS RLKVLTE VS SENEAQKQGDELAKLS S S FKALVTLLS 
EVE KMLSNFGD CVQYKE I VKNSLEEL I SGS KE VQEQAEKI LDTE 
NL F EAQQLLLH HQQKTKR ISAKKRDVQQQ I AQAQQGEGGLPDRG 
HB ELRKLESTLDGLERS REf?m?RP t nxrvr v u t?r> ccttkt vr»m« « m 
YLFQTGSSHERPLS FSSLES LSSELEQTKE FS KRTES I AVQAEN 

LVKEASEIPLGPC^KQr,LQ0^AKSIKEQVKKLEDTr4EBEyvrDK 
S 


5474 


2 


780 


j. i u v *\^u^ru3.njt\j vrio n LS rXn c Ao i£U.MAc VKS 3 WLLRQS T I 

LKRWKKNWFDLWSDGHL X YYDDQTRQN I EDKVHM PMDC IN I RTG 
QECRDTQPPDGKSKDCMLQ I VCRDGKTI SLCAESTDDCLAWKFT 
LQDSRTNTA Y VG S AVMTDETS WSS PPP YTAYAA? APE VGRTLS 
LQQAYGYGPYGGAYPPGTQWYAANGQAYAVPYQYPYAGLYGCi3 
PAKQVr XRER YRDNDSDLALGMLAGAATGMALGSLFWVF 


5475 


2 


506 


ARGWLESLSI/TCQTTPPPSSPCLLHSPETFIHTMPPNLTGYYRF 
VSQKNMEDYIiQALNl SlAVRKIAI#LLKPDKE I EHQGNHMT VRTL 
STFRNYTVQFDVGVEFEEDLRSVDGRKCQTIVTWEEEHLVCVQK 
GEVPNRGWRHWLEGEMLYLELTARDAVCEQVFRKVR 


5476 


192 


1457 


o u*n*LiuiJVi< WbR Tg VS S LRPEKQSETS IHQYL VDEPTLS WS R 
PSTRASEVLCSTNVSHYBLQVEIGRGFDNLTSVHLARHTPTGTL 
VTI K ITNLENCMEERLKALQKAVILSHFFRHPN I TT YWTVFTVG 
SWLWVISPFMAYGSASQLLRTYFPEGMSETLIRNIjFGAVRGLN 
YLHQNGCIHRSIKASHILISGDGLVTLSGLSHLHSIjVKHGQRHR 
AVYDFPQFSTSVQPWLSPELLRQDLKGYNVKSDI ys vgitacel 
ASGQVPFQDMKRTQMLLQKLKGPPYSPLDISIFPQSESRMKNSQ 
SGVDSG IGES VLVS SGTHTVNSDRLHTPSS KT FS PAFFSLVQLC 
«w«u* ojwrsiwauaanvi r ^U"«*J&o01«1JjSI*JjPPaYNKPSI 
SliPPVLPWTEPECDFPDEKDSYWEF 


5477 
5478 


3 


1044 


RGNSRLRYSHEDEIiQLPRLPELFETGRQIiLDEVEVATEPAGSRI 
VQEKVFXGLDLLBKAAEMLSQLDLFSRWEDLEEIASTDLKYLLV 
PAFQGALTMKQVNPSKRLDHLQRAREHFINYLTQCHCYHVAEF3 
LPKTMNNSAENHTANSSMAYPSLVAMASQRQAKIQRYKQKKELE 

HRLSAMKSAVESGQADDERVRE YYIiLHLQRWIDI SLEE IES I DQ 

E I KILRERD SSRE JLS TQMC QDrtrDDmrirntJTT rro-Mwn aam mr, » 
w * iv a uiuMWiaianimo i OMdoK^iuu^r V W't 1 IiTKNMAQAKVFQ A 

GYPSLPTMrVSDWYEQHRKYGALPDQGIAKAAPEEFRKAAQQQE 
EQEEKEEEDDEQTLHRAREWDDWKDTHPRGYGNRQNMG 




2 


835 


KTVR I WVPN VKGES WFRAHTATVRS VHFCSDGQS FVTASDDKT 
VKVWATHRQKFL FSXiS QHXNW VR CAKFS PDGRL I VSASDDKTVK 
LWDKSSRECVHSYCEHGGFVTYVDFHPSGTCIAAAGMDNTVKVW 
DVRTHRUjQH YQLHSAAVNGLS FHP SGN YL I TASSDSTLKILDL 
MEGRLL YTLHGHQGPATTVAFS RTCEYFAS OCZ ^nnnvMUuvcup 
D I GDHGEVTKVPRPPATLAS SMGNLTVS ILEQRLTLEEDKLKQC 
LENQQLI MQRATP 


5479 


2 


835 


KTVRIWVPNVKGESTVFRAHTATVRSVKFCS^ 

VKVNATHRQ K FL FS LS QHINWVR CAKFS PDGRL I VSASDDKTVK 

LWDKSSRECVHSYCEHGGFVTYVDFHPSGTCIAAAGMDNTVKW 

D VRTHFXLQHYQLHS AAVNGLS FHPSGNYL I TASSDSTLKILDL 

MEGRLLYTLHGHCX3PATTVAFSRTGBYFASGGSDEQVMVWKSNF 

DIGDHGEVTKVPR P PATLASSMGNLTVS I LEQRLTLEEDKLKQC 

LENQQL I MQRATP 


5480 


444 


1952 


LS LTS RMEE AELVKGRLQAITDKRK I QBE 1 5 Q KRLK I EED KLKH ' 
QHLKKKALREKWLLDGISSGKEQEEMKKQNQQDQHQIQVLEQSI 
LRLEKEIQDLEKAELQISTKEEAILKKLKS IERTTEDI IRS VKV 
EREERAEESIEDIYANIPDLPKSYIPSRLRKBINEEKEDDEQNR 
KALYAMEIKVEKDLKTGESTVLSS IPLPSDDFKGTGI KVYDDGQ 
KS VYAVSSNHS AAYNGTDGLAP VE VEELLRQAS ERNS KS PTE YH 
EPVYANPFYRPTTPQF^TVTPGPlTFQERIKIKTKrGLGICVNESI 
HNMGNGLSEERGNNFNHISPIPPVPHPRSVIQQAEEKLHTPQKR 








LMTPWEESNVMQDKDA1>SPKPRLSPRBTIPGKSEHQNSSPTCQE" 
DEEDVRYNI VHSLPPD INDTBP VTMI FMG YQQAEDSEEDKKFLT 
GYDGIIHAELWIDDEEEEDEGEAEKPSYHPIAPHSQVYQPAKP 
TPLPRKRSEASPHEKHXS 


5481 


3 


1422 


NSPGSVCLCQCVCPSLLHCIiPPLLLLLLLPLLLHESPQPPAtRV 

VAT^QnPMPMMTfWr^V DVr.TWVOPVT'DVDT^B VOvconrntrDnnntT tr 
vAtoa UKSt r wn ^nU-tvr V L» l v?y K rJvl KKkJjB KE KFH PTVFRDTLV 

QGLNEAGDDLEAVAXPLDSTGSRLD YRRYADT LFDI LVAGSMLA 

PGGrRIDDGDKTKMTNHCVFSANEDHETIRNYAQVFNKLIRRYK 

YLEKAFEDEMKKLLLFLKAFSETEQTKIiAMLSGILLGNGTLPAT 

ILTSLFTDSLVKEGIAJ^FAVKLPKAWMAEICDANSVTSSLRKAN 

LDKRLLELFPVNRQSVDHFAKYFTDAGLKELSDFLRVQQSLGTR 

KELQKELQERLSQECPIXBWLYVKEEMKRNDLPETAVIGLLWT 

CIMNAVBWNKKEELVAEOALKHLKQYAPLLAVFSSQGQSELI LL 

\j xvv yc iti jl nrniUiruAl v vj-te i KAL/ v JjocKAJ. JjKWY KEAK V 

AKGKSVFLDQMKKFVEWLQNAEEESESEGEEN 


5482 


14*2 


528 


THWMTGM CYAPHQVLS Y INGVTTSKPGVSLVYSMPSRNX»SLRIi 
EGLQEKDSGP YSCS VNVQDKQGKSRGHS IKTIiEUOVLVPPAPPS 
CRLQGVPHVG ANVTLS CQS PRSKPAVQYQWDRQLPS FQTFFAPA 
LDVXRGSLSLTmiSSSMAGVYVCKAHNEVGTAQam'LEVSTG P 

u/VAV VAUSAV VtjTJbVOLASijijAbljV buYHKKGKALEEPANDIKEDA 

IAPRTLPWPKSSDT I S KNGTLSS VTSARALRPPHGP PRPGALTP 
TPSLSSOALPSPRLPTTDGAHPQPISPIPGGVSSSGIiSRMGAVP 
VMVPAQSQAGSLV 


5483 


1 


788 


F FFFKGCRAGRGNES D YRKLEEMHQRFLVS ERS KDDLQLRLTRA 
ENR I KQLB TDS S EE I SR YQEM I QKL QNVLES ERENCGL VS EQR L 
KLQQENKQLRK3TESLRKIALEAQKKAKVKI STMEHEFS I KERG 
FBVOLREMEDSNRNSIVELRHLliATQQKAANRWKEETKKLTESA 
F I R J NNLKSELS RQKLHTQELLSQLEMANEKVAENE KL I LEHQE 
KANRLQRRLSQAEERAASASQQLSVITVQRRKAASLMNLENI 


5484 


3 


1997 


IMADMEDLFGSDADSEAERKDSDSGSDSDSDQENAASGSNASGS " 

ESDQDERGDSGQPSNKELFGDDSEDEGASHHSG3DNHSERSDNR 

SEASERSDHBDNDPSDVDQHSGSEAPNDDEDEGHRSDGGSHHSB 

AEGSEKAHSDDEfCWGREDKSDQSDDEKIQNSDDEERAQGSDEDK 

LQNSDDDEKMQNTDDEERPQLSDDERQQLSEEEKANSDDERPVA 

SDNDDEKQNSDDEEQPQLSDEEKMQNSDDERPQASDEEHRHSDD 

NS GTMDLFGG ADD I SSGSDG E DKPPT PGQ P VDENGLPQDQ QE EE 
PI PETR I EVE I PKVNTDLGNDLYFVKLPNFLS VE PRP FDPQY YE 
DEFEDEEMLDEEGRTRLKLKVENTIRWRIRRDEEGNEIKESNAR 
IVKWSDGSMSLHLGNEVFDVYKAPLQGDHNHLFIRQGTGLQGQA 
VFKTKLTFRPHSTOSATHRKMITiSLADRCSKTQKIRIIiPMAGRD 
PECQRTEMIKKEEERLRAS IRRESQQRRMREKQHQRGLSAS YLE 

PDRYDE EE EGEES T SUVA T KNTJYTf CJG TP PPPZi W T VQ cndnpft C c 
EDKAQRLL KAKKLTS DE VRPKLFNSRGLS CTQBPTALNBELTDQ 
AGTN 


5485 


161 


1074 


KRKIIiSSMMDSEAHEKRPPItiTSSKQDISPHITNVGEMKHYLCG 
CCAAFJJNVAITFP I QKVLFRQQLYG 1 KTRDAILQLRRDGFRNLY 
RGILPPL^KTITLAIjMTOLYEDLSCLLHKHVSAPEFATSGVAA 
VLAGTTEAI FTPLERVQTLTXJDHKHHDKFTN7YQAFKALKCHGI 
GEilfRGLVPIT^RNGLSNVIJFGLRGPIKBHLPTATTHSAHLVN 
DFICGGLLGAMLGFLFFPINWKTRIQSQIGGEPQS FPKVFQKI 
WLERDRKLINLFRGAHLNYHRSLISWGI INATYBFLLKVI 


5486 


1404 


142 


I PGSTISWSPAAARGLSVCRCCRIiHPASAMDLFGDLPEPERSPR 
PAAGKEAQKGPLLPDDLPPASSTDSGSGGPLLFDDLPPASSGrJS 
GS LATS I SQMVKTEG KGAKRKTS EEE KNGS EEL VEKKVCKAS S V 
IFGLKGYVAERKGEREEMQDAHVI LNDI TEECRP PSS LITRVS Y 
FAVFDGHGGIRASKPAAQNLHQNLIRKFP KGDV1 SVEKTVKRCL 
LDTFKHTDEEF JjKQAS SQKPAWKDGSTATCVLAVDN I LYIANLG 
D8RAILCRYNEBSQKHAALSLSKEHNPTQYEERMRIQKAGGNVR 
DGRVLGVLEVSRS IGDGQYKRCGVTS VPD I RRCQLT PNDRF ILL 








ACDGLFKVFTPEEAVNFIWCLEDEKIQTRBGKSAADARYEAAC 
NRLANKAVQRGSADNVTVMWRIGH 


5487 


535 


182 


AVSLEQIRGLQTPAP VPLP LQP CPSNCDMERVTLALLLI*AGLTA 
LEANDPFANXDDPFyyDWKNLQLSGLIOGGLLAIAGIAAVl^SGK 
CKCKSSQKQH3P VPEKAI PLITPGSATTC 


5488 


1072 


259 


AMAASGE PQRQWQEE VAAWWGSCMTDLVS LTS RL PKTGETIH 
GHKFFI GFGGKGANQC VQAARLGAMTSMVCKVGKDS FGNDY IEN 
LKQNDISTEFTYQTKDAATGTASIIVNNEGQNIIVIVAGANLLL 
NTEDLRAAANVI SRAKVMVCQLE ITPATSLEALTMARRSGVKTL 
FNPAPAIADLDPQFYTLSDVFCCNESEAE I LTGLTVGS AADAGE 
AALVLLKRGCQWIITLGASGCWLSQTEPEPKHIPTEKVKAVD 
TTV3FKI 


5489 


81 


893 


GKG P VAAFI DQSNI FLTPPXI FLGQWREEPKMPLLlAGE TEpLK 
LE RD CRS P VE P W AAAS PDLALACLCHCQDLS SGAFPNRGVLGGV 
IiFPTVEMVI KVFVATSSGS IAIRKKQQE WGFLEANKI DFKELD 
IAGDEDNRRWMRSNVPGEKKPQNGI PLPPQI FNBEQYCGDFDSF 
FSAKEENI I YS FLGLAPPPDSKGSEKA3EGGETEAQKEGS EDVG 
NLPEAQEKNEEEGETATEET3EIAMEGABGEAEEEEETAEGEEP 
GEDEDS 


5490 


81 


893 


GKG P VAAF I DQ SN I FLTD P Kl FLGQWR E E PKM P LLLLGE TE P LK 
LERDCRS PVE P WAAAS PDIALACLCHCQDLSSGAF PNRGVLGG V 
LFPTVEMVI KVFVATSSGS IAIRKKOQEWGFLEANFCIDFKEIjD 
I AGDEDNRRWMRENVPGBKKPQNGI PLPPQI FNEEQYCGDFDS F 
FSAKEENIIYSFLGLAPPPDSKGSEKAEEGGETEAQKEGSEDVG 
NLPBAQEKNEBEGETATEETEEIAMEGAEGEAEEEEETAEGEEP 
GEDEDS 


5491 


204 


1194 


GSAPRLS LGPTGAQARDFD W WARPPSR P YTQS KEDR PDTEGRS B 
QGDMASSFLPAGAITGDSGGELSSGDDSGEVEFPHSPEIEBTSC 
LAELFEKAAAHLQGLIQVASREQLLYLYARYKQ VKVGMCNT PKP 
SFFDFEGKQKWBAWKAliGDSSPSQAMQEYIAVVKKLDPGWffPQI 
PBKKGKEANTGFGGPVISSIjYHEETIREEDKNIFDYCREKNIDH 
ITKAI KS KNVDVNVKDEEGRALLHWACDRGHKELVTVLLQHRAD 
INOQDNEGQTALHYASACEFLDIVBLLLQSGADPTLRDQDGCLP 
EEVTGCKTVSLVLQRHTTGKA 


5492 


3 


1896 


AS KN PLS AVCTTG IMS SLAVRDPAMDRSLRS VF VGhf t P YEA'rEE 
QLKDIFSEVGSWSFRIiVYDRBTGKPKGYGFCEYQDQETALSAM 
RNLNGREFSGRALRVDNAASEKNKEELKSLGPAAPI IDSPYGDP 
IDPEDAPES I TRAVASL PPEQMFELMKQMKLCVQNSHQEARNML 
LQNPQLAYALLQAQVVMRIMDPEIALKIIiHRKIHVTPLIPGKSQ 
SVSVSGPGPGPGPGLCPGPNVLI^NQQNPPAPQPQELARRPVKDI 
PPLMQTP I QGGI PAPGP I PAAVPGAGPGSLTPGGAMQ P QLGMPG 
VGPVPLERGQVQMSDPRAPIPRGPVTPGGIiPPRGLLGDAPNDPR 
GGTIjLS VTGE VEPRG YLGPPHQGPPMHHASGHDTRGPS SHEMRG 
GPLGDPRLL I GEPRG PM I DQRGLPMDGRGGRD8RAMETRAMETE 
VLETRVMERRGMETCAKETRGMEARGMDARGLEMRGPVPSSRGP 
MTGGIQGPGP INIGAGGP PQGPRQVPGI SGVGNPGAGMQGTG I Q 
GTGMQGAG I QGGGMQGAG I QG VS IQGGG X QGGGI QGAS KQGGSQ 
PSS FSPGQSQVTPQDQEKAALIMQVLQLTADQIAMLPPEQRQS I 
LILKEQIQKSTGAS 


5493 


1 


1876 


RAPMMTKAVPE3PRKPGRLTQALNSPLTWEHVWICVPGGTPDCL 
TDT FR VKRPHL RRS ASNGHVPGTP VYRE KEDM YD EI IELKKSLH 
VQKSDVDLMRTXLRRLEEENSRKDRQIEQIiLDPSRGTDFVRTLA 
EKRPDASWVINGLKQRILKLEQQCKEKDGTISKLQTDMKTTNLE 
EMRLA>ffiTYYEE\mRLQTLLASSETTGKKPLGEKKTGAKRQKKM 
GSALLSLSRSVQELTEENQSLKEDLDRVLSTSPTISKTQGYVEW 
SKPRLLRRIVELEKKLSVMESSKSHAAEPVRSHPPACLASSSAL 
HRQPRGDRNKDHERI>RGAVRDLKEERTALQEQI*LQRDLEVKQLL 
QAKADLEKELECAREGEEERREREEVLREEIQTLTSKLQELpQEM 
KKEEKEDCPBVPHKAQELPAPTPSSRHCEQDWPPDSSEEGLPRP 
RSPCSDGRRDAAARVLQAQWKVYKHKKKKAVLDEAAVVLQAAFR 


5494 






GHLTRTKLLAS KAHGS E P PS VPQ IiPDQS S P VpR VPS> P i A&VTGS 

PVQEEAIVI IQSALRAHLARARHSATGKRTTTAASTRRRSASAT 

HGDASSPPFLAALPDPSPSGPQAVAPLPGDDVNSDDSDDIVIAP 
SLPTKNFPV 




71 


536 


RSKAKIGTPTREVPSTDMKVRRESSSSLTHRPAPSPATPRLLGT^ 
RRVLLGVS EGTGCADAMELVLVFLCS LLAPMVLASAAE ICE KEKD 
PFH YDYQTLRIGGLVFAWLFS VG I LL I L SRRCKCS FNQK PRA P 
GDEEAQVENIilTANATEPQKAEN 


5495 


273 


2168 


US> LLLI QVDTM PFTLHLRSRL PSA I RSLI LQ KKPNIR NTS SMAG 
ELRPASLWLPRSLW?APERFCQVNTGPLPLLGQSEPEKWMLPP 
QGA I S ETRMGHPQFWK YEFGACTGSLASLEQ YS EQLKDM VAFFL 
GCS FS LEEALEKAGLPRRDPAGHSQAGAYBCTTVPCVTHAGFCCP 
LWTMR P I P KD KLEG L VRACCS LG GEQ GQFVHMGDPE LLG I KEL 
imrn* \xvm*i v v.i'r'Lais vif vr WJb'isi'Jjl iJjGAVSSCETPLAFASIPG 
CTVMTDLKDAKAPPGCLTPBRIPEVHHISQDPLHYSIASVSASQ 
KIRELESMIGIDPGNRGIGHLLCKDELLKASLSLSHARSVLITT 
GFPTHFNHEPPEETDGPPGAVALVAFJLQALEKEVAIIVDQRAWN 
LKQKIVEDAVEQGVLKTQIPILTYQGGSVEAAQAFLCKNGDPQT 
PRFDHLVAIERAGRAADGNYYNARKMNIKHLVDP ID0LPLAAKK 
IPGISSTGVGDGGNELGMGKVKEAVRRHIRHGDVIACDVEADFA 
VIAGVSNWGG YALACAL Y IL YS CAVHS Q YLRKAVG PSRAPGDQA 
WTQALPS VI KBE KMLG I LVQHKVRSGVSGI VGMEVDGLP FHNTH 
AEMIQKLVDVTTAQV 


5496 


3 


2408 


(iUTKMHE I YKGNI T PQLNKNTLKTS AATDVWAVYF3QF W I D Y3G 
MKSGKGRPISPVDSPPLSIWICXJPTRYAESQKEPQTCNQVSLNT 
SQSBSSDLAGRLKRKKIiLKEYYSTESEPLTNGGQKPSSSDTF^R 
FSPSSSEADIHbLVHVHKHVSMQINHYQYLLUiFJjHESIil LLSE 
NI^KDVEAVTGSPASQTSICIGILLRSAELALLLHPVDQANTLK 
SPVSESVSPWPDYIiPTENGDFTiSSKRXQISRDINRIRSVTVNH 
MSDKRSMSVDLSHI PUCDPLLFKSASDTNLQKGIS FMDYLSDKH 
LGKISEDESSGLVYKSGSGEIGSETSDKKDSFYTDSSSVLNYR3 
DSNILSFDSDGNQNILSSTLTSKGNBTIESlFKAEDLXiPEAASL 
uu j.oivc»e» i e r v K.x ijAoyoaJjoGKFKERCPPNIiAPLCVS YKN 
MKRS S SQMS LDTI S LD SM I LE EQLLE SDGS DSHM FLEKGNKKNS 
TINYRGTAESVKAGANI^I^GETSPIJAISTNSEGAQENHDDIjMS 
VWFKI TG VNGB I DIRGEDTE ICLQVNQVTPDQLGN I SLRH YkC 
NRPVGSDQKAVIHSKSSPEISLRFESGPGAVIHSLLAEKNGFIiQ 
CHlKNFSTEFLTSSl^IQUFLEDETVATVmMKlQVSmKWL 
KDDSPRSSTVSLEPAPVTVHIDHLVVEPJSDDGSFHIRDSHMLNT 
GNDLKENVKSDSVLLTSGKYDLKKQRSVTQATQTSPGVPWPSQS 

ANFPEFSFDF^EQLMEENKSLKQEIiAXAK>lALAEAHLEICDALL 
HHIKKMTVE 


5497 
5498 


1821 


3308 


SISKLLKRRSNIDAYLLSNSCAFFAPRLFSLASQtlREQQSPNV 
CFI YKYSGFPSLECQCHFVS PHSSCYINFFS FPP P FFVCFQLSN 
GFSHYSLSSESHVGPTGAGLFPHCLPASRLLPRVT«5VHT Dnvao 
YYTIGPGMFPSSQ I PS WKDWAKPGPYDQPLVNTLQRRKEKREPD 
PNGGGPTTASGPPAAAEBAQRPRSMTVSAATRPGEEMEACEELA 
LALSRGLQLDTQRSSRDSLQCSSGYSTQTTTPCCSEDTIPSQVS 
DYDYFSVSGDQEADQQBFDKSSTI PRNSD ISQS YRRMFQAKRPA 
S TAGLPTTLGPAMVTPGVATIRRTPSTKPS VRRGTIGAGPI p I K 
TPVIPVKTPTVPDLPGVLPAPPDGPEERGEHSPESPSVGEGPQG 
VTSMPSSMWSGQASVNPPLPGPKPSIPBEHRQAIPBSEAEDQER 
EPPS ATVS PGQ I PE SD PADLS PRDTPQGEDM&NAIRRG VKLKKT 
TTNDRSAPRFS 




2434 


1492 

1 


ix.thqeiftge:<pcbcgkasiqmshlsqqkiysgenpfack\/cg 

KVFSHKSNLTEHEHFHTREKPFECNECX5KAFSQKQYVIKHQNTH 
TGEKLFECNECGKSFSQKENLLTHQKIHTGEKPFECKDCGKAFI 
QKSNLIRHQRTHTGEKPFVCKECX3KTFSGKSNLTEHEKIHIGEK 
PFKCS ECGTAFGQKKYLIKHQNIHTGBKP YECMEGGKAFSQRTS 
L I VH VRIHSGDKP YECNVCGKAFSQS SS LTVHVRSHTGE KP YGC 
MECGKAFSQFSTLAIjHLRIHTGKKPYQCSECGKAFSQKSHHIRH 








QKIHTH 


5499 


324 


926 


GFGQ IGRGHKI TTYPFS PRKSGRXGMAQSQGWVKRYI KAFCKGF " 
FVAVPVAVTFLnRVACVA.RVFCSAQMt~0<5T >j - Dn/-»crkGcmnrr r *ytt 

WKVRN FEVHRGD I VS LVS P KNPEQKI 1 KRVIALEGDI VRT IGHK 
NRYVKVPRGH 1 WVEGDHHGHSFDSNS FGPVS LGLLHAHATH I LW 
PPERWQ KLESVLP PERLP VQREEE 


5500 


1978 


1286 


KPDWRLQNliPPRLYLWRSSRFGFGHLKKRLQMDFKIEHTWDGFP 
VKHEPVFIRLNPGDRGVMMDISAPPFRDPPAPLGEPGKPFNELW 
DybVVfciAhFLNDITEQYIiEVELCPHGQHLVLLLSGRRWWKQEL 
PLS FR VS RGETKWEGKAYLPWS YFPPNVTKFNS FAIHGS KDKRS 
YEAI»YPVPQHBLQQGQKPDFHCI#EYFKSFNFNTLIiGEEWKQPSS 
DLWLIEKCDI 


5501 


2927 


2226 


CRPPVSAR\^Ap6HQGiAVGGSGRRPARVEWDAAARPSSRPFSLP 
AA1MLALISRLLDWFRSLFWKEEMELTLVGLQYSGKTTFVNVIA 
SGQFSEDMI PTVGFNMRKVTXGNVTIKIWDIGGQPRFRSMWERY 
CRG VNAI VYM IDAADREKI E ASRNEIiHNLLDKPQLQG I PVLVLG 
NKRDLPNALDBKQLIEKMNLSAIQDREICCYSISCKEKDWIDIT 
LQWLIQHSKSRRS 


5502 


3 


824 


NSAFPVWVPERTALLTCPLGAAPGSSfeEAPGIAGPPNSTAMSKL 
GKFFKGGGS SKSRAAPSPQEALVRLRETEEMLGKKQEYLENR I Q 
RE I AIAKKHGTQNKRAALQALKRKKRFE KQLTQIDGTLST I E FQ 
REAliENSHTOTEVLRNMGFAAKAMKS VHElSMOLETKIDDLMQE I T 
EQQDIAQE I SEAFSQRVGFGDDFDEDEbMAELEELEQEELNKKM 
TNIRLPNVPSSSLPAQPNRKPGMSSTARRSRAASSQRAEEEDDD 
IKQLAAWAT 


5503 


216 


654 


KGVRRRGRVRSDSEDSHLGYFKMSFLLPKLTSKKEVDQAlKSTA 
BKVLVLRFGRJDEDPVCLQLDD I LSKTSSDLSKMAAI YLVDVDQT 
AVYTQYFDISYIPSTVFFFNGQHMKVDYGGEDPALRSIKAVRRT 
SPAGTLGEKPVNS 


5504 


58 


3563 


QLSFSFQAPVTFDDITVYLLQBEVJVLLSQQQKELOGSNKLVAPIi " 
GPT VAN PELFRKFGRGPEPWLGS VQGQRSLLEHHPGKKQWG YMG 
EMEVQGPTRESGQSLPPQKKAYLSHLSTGSGHIEGDWAGRNRKL 
LKPRS IQKSWFVQFPWLIMNEEQTALFCSACREYPS IRDKRSRL 
IEGYTGPFKVETLKYHAKSKAHMFCVNALAARDPIWAARFRSIR 
DPPGDVLAS PEPLFTADCP I FYPPGPLGGFDSMAELLPSSRAEL 
EDPGGDGAIPAMYLDCISDLRQKEITDGIHSSSD INIL YNDAVE 
SCI QDP SAEGLSEE VPWFEELPWFEDVAVYFTREE WGMLDKR 
QKELYRDVMRMNYBLliASLGPAAAKPDLISKLERRAAPWlKDPW 
GPKWGKGR PPGNKKNVAVREADTQASAADSAIjLPGS PVEARAS C 
CSSS ICEEGDGPRR I KRTYRPRS IQRSV7FGQFPWLVIDPKETKL 
FCSAClERPWIiHDKSSRLVRGYTGPFKVETIiKYHBVSKAHRLCV 
NTVEIKEDTPHTALVPEISSDLMANMEHFFNAAYSIAYHSRPLN 
DFEKILQLLQSTGTVILGKYP»NR I EACTQPIKYISETLKREILBD 
VRNS PCVS VhLDSSTDAS EQACVG I YIR YFKQMEVKE SYXTLAP 
LYSETADGYFETIVSAiDELDIPFRKPGMWGLGTDGSAMLSCR 
GGI.VEKFQEVIPQLLPVHCVAHRI1HIAWDACGSIDLVKKCDRH 
xttl v c j W^S«-^i^o JjQEG AAPIjEQB 1 1 RL KDLNAVR WV ASR 
RRTLHALLVSWPALARHLQRVAEAGGQIGHRAKSMLKLMRGFHF 
VXFCHFLLDFLSIYRPLSEVCQKEIVLITEVNATLGRAYVALES 
LRHQAGPREEEFHASFKDGRLHGI CLDKLEVAEQRFQADRERTV 
LTGIEYLQQRFDADRPPQLIOJMEVFDTMAWPSGIELASFGNDDI 
LNliARYFECSLPTGYS EEALLEEWLGLKTI AQHLPFS MLCKNAL 
AQHCRFPLLS KLMAVWCVPISTSCCERGFXAMNRIRTDERTKL 
SNEVLNMLMMTAVNGVAVTEYDPQPAIQHWYLTSSGRRFSHVYT 
CAQVPARS PAS ARLR KEEMGAL YVEE PRTQ KP P ILP SREAAEVL 
KDCIMEPPERLLYPHTSQEAPGMS 


5505 


3312 


1219 


NCS PRS LSAAKM5NRNNNKLPSNL PQLQNL I KRDPPAY I EEFLQ " 
QYNHYKSNVE I FKLQPNKPSKELAELVMFMAQISHCYPE YLSNF 
PQEVKDLLSCiraTVLDPDLRMTFCKALILliRinCNLINPSSIiLEL 
FFBLFRCHDKLLRKXIiYTHI VTD I KNINAKH KNNKVNWLQNFW 








YTMIiRDSNATAAKMSLDVMIELYRRNIWNDAKTVNVITTACPSK 
VTK I LVAALTFFLGKDEDEKQDSDS ESEDDGPTARDLLVQ YATG 
K KS> S kwkkkIjEKAMKvIjKKHRKKKKPEVFNFSAIKL ihd PQDFA 
EKLLKQLECCKERFBVKMMLMNLISRLVGIHELFLFNFYPFLQR 
FLQPHQREVTKILLFAAQASHHLVPPEI IQSLLMTVANNFVTDK 
NSGE VMTVG I NAI KE I TARCPIiAMTE ELLQ D LAQ YKTHKD KNVM 
MSARTLIHLFRTLNPQMLQKKFRGKPTEAS I EARVQE YGELDAK 
DY I PGAE VLEVEKEENAENDEDGWES TSLS EEEDADGEWI DVQH 
SSDEEQQE ISKKLNSMPMEERKAKAAAISTSRVLTQEDFQKIRM 
AQMRKELDAAPGKSQKRKYIEIDSDEEPRGELLStiRDIERLHKK 
PKSDKETRIiATAMAGKTDRKEFVRKKTKTNPFSSSTNKEKKKQK 
NFMMMRYSQNVRSKNKRSFREKQLALRDALLKKKKRMK 


5506 


1 


1531 


FRGDLCGQRGGSAPGEGGSSAWPAPAkPLPERERBREALCPGRS 
CSGGGGEETPGTTPVWSPLEGGGDEELRPNPYVRFPYRWWAWV 
LAAF P S L£ AGGETP E AP PES WTQL WF FR FWNAAGY AS FMVPG Y 
LLVQY FRRKNYLETGRGLC FPLVKACVFGNE PKASDEVPIAPRT 
EAAETTPMWQALKLLFCATGLQVSYLTWGVLQERVMTRSYGATA 
TSPGERFTDSQFLVLMNRVLALIVAGJbSCVLCKQPRHGAPMYRY 
S FASL SNVLS S WCQYEALKFVS FPTQVLAKAS KVI PVMLMGKLV 
SRRSYEHWBYLTATLISIGVSMFLLSSGPEPRSSPATTLSGLIL 
LAGYIAFOSFTSNWQDAIiFAYKMSSVQMMFGVNFFSCLFTVGSL 
LEQGALLEGTRFMGRHS EFAAHALLLS ICS AOGQLFI FYT IGQF 
G AAVFTI I MTLRQAFAI LLS CLLYGHTVTVVGGLGVAVVFAALL 
LRVYARGRLKQRGKKAVPVE3PVQKV 


5507 


3704 


1271 


PRGTRRCRPAGRASRRARRRPPCPGPAAPGSLE IGGFGTAAGKK " 
VAVADVQFGPMRFHQDQLQVLLVFTKEDNQCNGFCRACEKAGFK 
CTVTKEAQAVLACFLDKHHDI 1 1 IDHRNPRQLDAEALCRS IRSS 
KJ.»S EN TVI VGWRRVDREELSVMPFISAGFTRR YVENPN I MACY 
NEIiLQX#EFGEVRSQIJCLRACNSVFTALENSBDAI2ITSEI)RFIQ 
YANPAFETTMGYQSGELIGKBLGBVPINEKKADLIiDTINSCIRI 
GKEWQG I YYAKKKNGDNIQQNVKI I PVIGQGGK IRHYVS I IRVC 
NGNNKAEKISBCVQSDTHTDNQTGKHKDRRKGSIiDVKAVASRAT 
EVSSQRRHSSMARIHSMTIEAPITKVINIINAAQESSPMPVTEA 
IiDRVLE ILRTTEL YSPQFGAKDDDPHANDLVGGIiT«3SDGIiRRLSG 
NEYVLS TKNTQMVS SNI ITP I SLDDVPPRIARAMENEEYWDFDI 
FELEAATHNRPLIYLGLKMFARFGICEFLHCSESTLRSWLQI IE 
ANYHS SNPYHKSTHSADVLRATAYFLSKERI KETJbDPIDB VAAL 
IAAT IHDVDHPGRTNS FLCNAGSELAILYNDTAVLESHHAALAF 
QLTTGPDKCNIFKNMERNDYRTLRQGI IDMVLATEMTKHFEHVN 
KPVKSINKPLATLEENGETDKNQEVINTMLRTPENRTLIKRMLI 
KCADVSNPCRPLQ YCIEWAAR ISEEYFSQTDEE KQQGLPWMP V 
FDRNTCS I PKSQ IS FIDYFITDMFDAWDAFVDLPDLMQHLDNNF 


5508 


1151 




LSSVFSRRSASMFAVGCSMGPFLHYWYLSLDRLFPASGLRGFPN" 
VLKKVLVDQLVASPLLGVWYFI^I^C^GQTVGESCQELREKFW 
EFYKADWCVWPAAQFVNFLFVPPQFRVTYINGLTLGWDTYLSYL 
KYRSPVPLTPPGCVALDTRAD 


5509 


1238 


619 


RKSRGC^NAI^ASGPAAAAAAIMVRKLKFHEQKIJLikQVD^LNttE 
VTDHNLHELRVLRRYR1^RREDYTRYNQLSRAVRELARRIjRDI»P 
ERDQFRVRASAALLDBCLYALGLVPTRGSLELCDFVTASSFCRRR 
LPTVLLKLRMAQHLQAAVAFVEQGHVRVGPDWTDPAFI*VTRSM 
EDFVTWVDSSKIKRHVLEYNEERDDFDLEA 


5510 


96 


1195 


PAGAHLSSGSSEPLVEPGRGRVGARVKGERGLQASGSAPGRSKM 
AEGERQPPPDSSEEAPPATQNFIIPKKEIHTVPDMGKMKRSQAY 
ADYIGFIIjTLNEGVKGKKLTFEYRVSEAIEKLVALLNTLDRWID 
ETPPVDQPSRFGNKAYRTWYAKLDEEAENLVATVVPTHLAAAVP 
BVAVYLKES VGNS TR I D YGTGHEAAFAAFLCCLCKIG VLRVDDQ 
IAIVTKVFNRYLEVMRKI^KTYRMEPAGSOGVWGliDDFQFLPFr 
WGSS QLIDHP YLE PRHF VDE KAVNENHKD YMFLECI LF I TEMK7 
GPFAEHSNQXiWNI SAVPS WS KVNQGLIRMYKAECLEKFP VIQHF 
KFGSLLPIHPVTSG 


5511 


276 


1980 


KLHRVI.NLPPKlMilTSISAVPtsOkfiEVADPQLSVDSLLEKDND 
HSRPDIQVQAKRLAEKLRCDTWSEISTGQRTVNFKINRELLTK 
TVLQQVIEDGSKYGLKSELFSGLPQKKIWEFSSPNVAiCKFHVG 
HLRSTI IGNFlANLKEALGHQVIRINYLGDWrcMTiTOT ryw*™ 

FGyEEKLQSNPIK)HLFEVYVQVNKBAADDKSVAKAAQEFFQRIiE 
LGDVQALS LWQKFRDLS I EE Y X RVYKRLGVY FDE YSGES F YREK 
SQEVLKhhBSKGLLLKTIKGTAVVDLSGmDPSSICrvmSDGT 
SLYATRDLAAAIDRMDKYNFD^IYVTDKGQKKHPQQVPQMLKI 
MGYDWAERCQHVPFGWQGMKTRRGDVTFLEDVLNEIQLRMLQN 
MAS I KTTKELKNPQETAERVGLAALI IQDFKGLLLSDYKFSWDR 
VFQSRGDTGVFLOYTHARTiHST.PVTPT'mvT untniT^riT a Bt ,*„ 

VSILQHLLRFDEVLYKSSQDFQPRHIVSYLLTLSHLAAVAHKTL 
QIKDSPPEVAGARLHLFKAVRSVLANGMKLLGITPVCRM 


5512 


120 


1015 


DPSLIiLTITVTGVTVLVLVLKSM^SRRREPITI^DPEAKYPLPL 
IE KE KI SHNTRR FRFGLP SPDHVLGLPVGNYVQLLAKI DNELW 
RAYTPVSSDDDRGFVDLIIKIYFKNVHPQYPEGGKMTQYLENMK 
IGETIFFRGPRGRLFY1IGPGNLGIRPDQTSEPKKTLADHLGMIA 
**** "W-u * "-T- * i iwf&Uft i KMoLil FANQTEED I L VRKEIiE 
EIARTHPDQFDLWYTliDRPPIGWKYSSGFVTADMIKBHLPPPAK 
STLI LVCGPP PLIQTAAHPNLEKliGYTQDMI FTY 


5513 
5514 


2 


837 


ARWRLPSDSFKIPPAGAETPGRGSCRNYLPSSSPPPPE'PSSFPS 
PPTSRGGPGSRDIMSDSEEESQDRQLKIWLGDGASGKTSIjTTC 
FAQETFGKQYKQTIGIjDFFLRRITLPGNIJMVTLQIWDIGGQTIG 
GKMLDKYI YGAQGVLLVYDITNYQS FENLEDW YTWKKVSEESE 
TOP LVALVGNKIDLEHMRT T KPETfHT .t? srnTHrpccuin *v wk<^ 

D5VFLCFQKVAAEILGIKLNKAEIEQSQRVVKAD I VNYNQEPMS 
RTVNPPRSSMCAVQ 




1295 


449 


VNRPS WI MGM F RGHAI*PGT I FFF 1 IGLWWCTKS 1 LKYICKKQJKRT 
CYLGSKXLFYRLEILEGITIVGMALTGMAGEQFI PGCPHLMLYD 
YKQGHWNQLLG^raHFTMYFFFGLLGVADILCFTISSLPVSLTKL 
MLSNALFVEAFI FYNHTHGREMLDIFVHQLLVLWFLTGLVAFL 
EFLVRNJJVLLELIiRSSIiILIKJGSWFFQIGFVLYPPSGGPAWDIjM 
DHENI LFLT I CFCWHYAVTI VIVGMNYAFITWIi VKSRLKRLCS S 
EVGIiLKNAEREQESEEEM 


5515 


1572 


260 


FVRLVGRGDCDPLliSVCl!iTTMPr J VEdT/3qfy'pt/T»a\nrT^T^»PTio — 
TKCGFAGETGPRC 1 1 PS V I KRAGMPKP VR VVQYN INTEELYS Yb 
KEFI HILYFRHLLVNPRDRRWI IBS VLCPSHFRETLTRVLFKY 
FEVPSVM^SHLMALLTLGINSAMVLDCGYRESLVLPIYBGIP 
VLNCWGALPLGaKALHKBLETQI^EQCTVDTSVAKEQSLPSVMG 
5 VPEG VLEDIKARTCF VSDLKRGLKIQAAKFNI DGNNERPS PP P 
MVDYPLDGEKIIiHILGSIRDSVVEILFEQDNEEQSVATLILDSL 
IQCPIDTRKQLAENLWIGGTSMLPGFLHRIiLAE I RYLVEKPKY 
KKALGTKTFRIHTPPAKANCVAWLGGAI FGALQDILGSRSVSKE 
YYNQTGRI PD WCSLNNP PLEMM FDVGKTQPPLMKRAFSTEK 


5516 


3 


735 


NSREPPOAGPGPSPRKSPTASSFIiFPWRPX«ASSFW«GAC?GAQES 
I KAMWRVPGTTRRPVTGBS PGMHRPEAMIiLLLTLALLGGPTWAG 
KMYG PGGGKYFSTTED YDHE1 IX3LRVS VGLLu VKS VQVKLGDS W 
DVKLGALGGNTQE VTLQPGE Y I TKVFVAFQAFLRGMVMYTS KDR 
YFYFGKIiDGQISSAYPSQEGQVLVGIYGQYQLLGIKSlGFEWNY 
PLEEPTTEPPVNLTYSANSPVGR • 


5517 
5518 


246 


499 


S E I YVAMRTDSS KMTDVESG VANFASSARAGRRNALPD IQSSAA 
TDGTS DLPLKLEALS VKSDAKEKDEKTTQDQLEKPQWEEK 




3 


1375 

< 


DAWADAVIVRAWDLNMDFPCLWLGLLLPIjVAAIjDFNYHRQEGMEA ' 
FLKTVAQN YSS VTHLHS I GKSVKGRNLWVLWGR FPKEHRIG I P 
EFKWANMHGDETVGRELIiHLIDYLVTSDGKDPEITNLINSTR 
IH1 MPSMNPDGFEAVKKPDC YYS IGRENYNGYDLNRNFPDAFB Y 
NNVSRQPETVAVMKWLKTETFVLSANLKGGAIiVASYPFDNGVQA 
TGALYSRS LTPDDD VTQ YLAHT YAS RNPNMKKGDE CKNKMNF PN 
3 VTNGYS WYP£»QGGMQD YNYI WAQCFEITLELS CCKYPREEKLP 
S FWNNNKASLI EIYI KQ VHLGVKGQVFDQNGNPLPNVI VE VQDRK 








HICPYRTNKYGEYYLLLLPGSYIINVTVPGHDPHITKVIIPEKS " 
QN FSALKKDI LLP FQGQLDS I P VSNPSCPM I PL YRNLP DHS AAT 
KPSLFLFLVSLLHIFPK 


5519 


87 


477 


I KS KLNOOVE VOE SEWRLTEAKG PTMGKRSGWDSGR JVAVlvavvfa' 
G VVAVGTVLVALS AMGFTS VG IAAS5 I AAKMMS TAAI ANGGGVA 
AGSLVAILQSVGAAGLSVTSKVIGGFAGTALGAWLGSPPSS 


5520 


117 


943 


SQEGKDEVKPKILANGARWKYMTLLNLLLQTIFYGVTCLDDVLK 
RTKGGKDI KFLTAFRDLLFTTLAFP VSTFVFLAFW I LFLYNRDL 
1 YPKVLDTVI PVWLNHAMHTFIFPITLAEWLRPHS YPS KKTGL 
TLLAAAS IAYI SRI LWLYFETGTWVYPVFAKLSLLGLAAFFSLS 

PAKHQLVKNIR 


5521 


546 


911 


KILNMQKSCEENEGKPQNMPKAEE'l&RPLEbVPQEAEGNPQPSEB 
GVSQEAEGNPRGGPNQPGQGFKEDTPVRHLDPEEMIRGVDELER 
LREEIRRVRNKFVMMHWKQRHSRSRPYPVCFRP 


5522 


1224 


637 


GSRPLGQRSREKMWVFGYGSLIWKVDFPYQDKLVGYITNYSRRF 
WQOSTDHRGVPGKPGRWTLVEDPAGCVWGVAYRLPVGKEEEVK 
AYLDFREJttK?YRTTTV"IFYPKDPTTK?FSVIiLYIGTCDNPDYIiG 
PAPLBDIAEQ I FNAAGPSGRNTEYLFELANSIRNLVPBKADEHL 
FALE KLVKERLEGKQNLNC I 


5523 


3 


1280 


a Ma lUUtWja t> HbAM 1 AKR P VFDDKEDVNFDHFQ I LRAIGKGSFG 
KVCIVQKRDTEKMYAMKX14NKQQCIERDEVRNVFRELEILQEIE 
HVFLVNLWYS FQDEEDMFMWDLLIiGGDLRYHLQQNVQFSEDTV 
RLYICSMALALDYLRGQHIIHRDVKPDNILLDERGHAHLTDFNI 
ATI I KDGERATALSGTKPYMAPEI FHSFVNGGTGYSFE VDWWSV 
GVMAYELLRGWRP YDIHSSNAVES LVQLFSTVS VQ YVPTWS KEM 
VAliLRKLLTVNPEHRLSSLQDVQAAPAIiAGVLWDHLSEKRVEPG 
FVPNKGRLHCDPTFELEEMILESRPLHKKKKRLAKNKSRDNSRD 
S SQSENDYLQDCLDAI QQDFVI FNREKLKRS QDLPRE P L PAPES 
RDAAEPVEDEAERSALPMCGPICPSAGSG 


5524 


85 


2318 


RERERDHR PGES SQGQSGAGGCF PS P TMELRGGGLLFS SRFDSG - " 
NLAHVEKVBS LS SDGEG VGGGASALTSGI AS SPDYE PNVWTRPD 
CAETEFENGNRSWFYFSVRGGMPGKLIKINIMNtWKQSKLYSQG 
MAPFVRTLPTRPRWERIRDRPTFEMTETQFVLSFVHRFVBGRGA 
TTFFAFCYPFSYSDCQELLNQLDQRFPENHPTHSSPLDTIYYHR 
EELLCYSLDGLRVDLLTITSCHGLREDREPRXiEQLFPDTSTPRPF 
RFAGKRIFFLSSRVHPGETPSSFVFNGFLDFILRPDDPRAQTLR 
RLFVFKLI PMLNPDGVVRGRYRTDSRGVNT.NPf) YT .tf PnRUT.nD a 

IYGAKAVLLYHHVHSRLNSQSSSEHQPSSCLPPDAPVSDIjBKAN 
NLQNEAQOSHSADRHNAEAWKQTEPAEQKIiNSVWIMPQQSAGLE 
ESAPDTI PPKESGVAYYVDLHGHASKRGCFMYGNSFSDESTQVE 
NMLYPKLISLNSAHTOFXJGCNFSEKNMYARDRRDGQSKEGSGRV 
AIYKASGIIHSYTLECNYNTGRSVNSIPAACHDNGRASPPPPPA 
FPSRYTVELFEOVGRAMAIAALDMARCMPMPRTVT.QRUCjeT.'rTaT 

RAWWLKHVRNSRGLSSTLNVGVNKKRGLRTPPKSHNGLPVS CSE 
NTLS RARS FSTGTS AGGSS SSQQNSPQMXNSPS FP FHG SRPAGL 
PGIXSSSrQKVTHRVLGPVRGKPVWEPl^HV-FGCLGHCWGK 


5525 


105 


834 


SNTLDFERHLFIMGQQISDQTQLVINKLPEKVAKHVTtVRESGS 
LTY^EFLGRVAELNDVTAKVASGQEKHLLFEVQPGSDSSAFWKV 
VVRWCTKINKSSGIVEASRIMNLY^FIQLYTOITSQAAGVLAQ 
SSTSEEPDENSSSVTSOQASLWMGRVKQLTDEEECCICMDGRAD 
LILPCAHSFCQKCIDKWSDRHRNCPICRLQMTGANESWWSDAP 
TEDDMANYILNMADEAGQPHRP 


5526 


3 


853 


RRPCN PVRAAKRTGAAARA PRGLE VTMLR VAWRTLS L lRTRAVf~~ 

QVLVPGLPGGGSAKFPFNQWGLQPRStLLOAARGYVVRKPAQSR 

XJDDDPPPSTLLKDYQNVPGIEKVDDWJOILLSLEMANKKEMLKI 

KQEQFMKKIVANPEDTRSLBARIIALSVKIRSYEBHLEKHRKDK 

AHKRYLI^SIDQRKJCMLKNLRNTNYDVFEKICWGIiGIEYTFPPL 

YYRRAHRRFVTKKALCIRVFQETQKLKKRRRALKAAAAAQKQAK 








RRNPDSPAKAIPKTLKDSQ 


5527 


3225 


565 


LLR X YLl/HQfl PLLLRHQPNRtfCI S F^ATMKIiKDTKSRP KQSS CG 
KFQTKG I KWG KWKE VKI DPNMPADGQMDDLVCFEELTD YQLVS 
PAKNPSSLPSKEAPKRKAQAVSEEEEEEEGKSSSPKKKIKLKKS 
KNVATEGTS TQ KE PEVKD PELEAQGDDMVCDDPEAGEMTSENL V 
QTAP KKKKNKG KKGLE PSQSTAAKVPKKAKTW I PEVHDQKADVS 
AWKDLFVPRPVLRALSPLGFSAPTPIQALTLAPAIRDKLDILGA 
AETGSGKTLAFAIPMIHAVLQWQKRNAAPPPSNTEAPPGETRTE 
AGAKTRSPGKAEAESDALPDDTVI ES EALPSDIAAEARAKTGGT 
VSDQALLPGDDDAGEGPSSLIREKPVPKQNENEEENLDKEQTGN 
LKQELDD KS ATCKAYPKR PLLGLVLTPTRELAVQVKQHIDAVAR 
PTGlKTAIIiVGGMSTQKQQRMLNRRPEIWATPGRLWELlKEKH 
YHLRiJIjRQLRCLVVDEADRMVEKGHPAEriaOT.T.PMT.MnQnvra'Dv 

RQTLVFS ATLTLVHQAPARI LHKKHTIOCMDKTAKLDIiLMQKIGM 
RGKPKVIDLTRNEATVETLTETKIHCETDBKDFYLYYFLMQYPG 
RSLVFANS ISCI KRLSGLLKVLDIMPLTLHACMHQKQRLRNLBQ 
FARLEDCVLLATDVAARGLDI PKVQHVIHYQVPRTS E I YVHRSG 
RTARATNEGLSLPOLlGPEDVINFKKlYKTLKKDEDIPLFPVC/rK 
YMDWKERIRLARQIEKSEYRNFQACLHNSWI EQAAAALE I ELE 
EDMYKGGKADQQEERRRQKQMKVLKKELRHLLSQPLFTESQKTK 
YPTQ3GKPPLLVS APSKSESAI*S CLS KQKKKXTKKPKEPQPEQP 
QPSTSAN 


5528 


3 


895 


GPFLSA.CRMWGACKVKVHDSLAT IS I TLKRYLRLGATMAKS KFE 
YVRD FEADDTCLAH CW WVRLDGRNFHR FAE KHNFAKPNDS RAL 
QLM7KCAQTVME ELED1 V I AYGQSDE YSF VF KRKTNWF KRRAS K 
FMTHVASQ FAS S YVF YWRD YFEDQPLLYPPG FDGR VWYPSNQT 
LKDYI^TOOADCKXNNLYNTVFWALXQQSGLTPVQAQGRI^GTr, 
AADKNEILFSEFNINYNNEPPMYRKGTVliIWQKVDEVMTKEIKL 
PTEMEGKKMAVTRTRTKPCKPSHLPRAPCLRWL 


5529 


48 


640 


TFRLVSAHIJCTRKLINPEAAERRWRDWDSRC2GWI,SVkMQRVSGl7" 
LSWTLSRVLWLSGLSEPGAARQPRIMEEKALEVYDLIRTIRDPE 
KPNTLEELEWSESCVEVQEINEEEYLVI I RFTPTVPHCSLATL 
IGI*CLRVKLQRCLPFKHKLB I YIS EGTHSTEE D INKQ INDKE RV 
AAAMENPNLREIVEQCVLEPD 


5530 


4541 


2606 


AQIVHAISYCHKLHVGHRDIiKPENVVFFEKQGLVKIiTDFGFSNK 
FQPGKKLTTSCGS LAYSAPE ILLGDEYDAPAVD I WSLGVTLFML 
VCGQPPFQEANDSBTLTMIMDCKYTVPSHVSKBCKDLITRMLQR 
DPKRRASLEEIENHPWLQGVDPSPATKYNIPLVSYKNLSEEEHN 
S 1 1 QRMVLGD IADRDAI VEAUSTNRYNHITAT YFLLAER I LREK 
QEKE IQTRS&SPSNIKAQFRQS WPTKI D VPQDL EDDLTATPLS H 
ATVPQS PARAADS VLNGHRS KGLCDS AKKDDLPELAGPALS TVP 
PASLKPTASGRKCtiFRVBEDBEEDEEDKKPMSLSTQWLRRKPS 
VTNRLTSRKSAPVLNQI FEEGESDDEFDMDENLPPKLSRLKMNri 
ASPGTVHKRYHRRKSQGRGSSCSSSETSDDDSESRRRLDKDSGF 
TYSWHRRDSSBGPPGSEGDGGGQSKPSNASGGVDKASPSBNNAG 
GGS PSS GSGGN PTNTSGTTRRCAG P S NS MQLAS RS AGEL VES L K 
LMSLCLGSQLHGSTKYIIDPQNGLSFSSVKVQEKSTWKMCISST 
GNAGQVPAVGG I KFFSDHMADTTTELERIKSKNLKNNVLQLPLC 
EKTISVNIQRNPKEGLLCASSPASCCHVT 


5531 


24 


515 


GSQPRAPRPRDSMERPEPELIRQSWRAVSRSPLEHGrvIiFARLF 
ALEPDIj1iPLFQYNCRQPSSPEDCI.»SSPEFLDHIRKVMIiVIDAAV 
TNVEDLSSLEEYXASI^^KHRAVGVKLSSFSTVGESLLYMLEKC 
LGPAFTPATRAAWSQLYGAWQAMSRGWDGB 


5532 


3395 


1402 


SDWMWGKRKM I IEDETEFCGEELLHSVLQCKSVFDVliDGEEMR 

RARTRANPYEKIRGVFFIiNRAAMKMANMDFVFDRMFTNPRDSYG 

KPLVKDRBABLL YFAD VCAGPGGFS E YVLWRKKWHAKGFGMTLK 

GPNDFKIiEDFYSASSELFEPYYGEGGIDGDGDITRPBMISAFRN 

FVLDNTDRKGVHFIjaAIXaGFSVEGQENLQEILSKQLLLCQFIiMA 

LSIVRTGGHFICBCTFDLFTPFSVGLVYLLYCCFBRVCLFKPITS 

RPANSERYWCKGLKVGIDDVRD YLFAVN I KLNQ LRNTDSDVNL 
WPLEVIKGDHEFTDYMIR5NESHCSLQIKALAKIHAFVQDTTIT 
SEPRQAEIRKECLRLWGIPDQARVAPSSSDPKSKPPELIQGTE1 
DI PS YKPTLLTS KTLBKI RP VFDYRCMVSGS EQKFLIGIjGKSQ I 
YTWDGRQSDR WI KLDtiKTELPRDTLLSVE X VHELKGEGKAQRKI 
SAIH I LDVLVLNGTDVRBQHFNQRI QLAEKFVKAVSKPS RPDMN 
tri n v a.o v i tints antsi\.± r vk JjKMaI JL JGGSSGTPiCijSYTGRDDRHF 
VPMGLYIVRTVNEPWTMGPSKSPKKKFFYNKKTKDSTFDLPADS 
IAP FH I CYYGRL FWEWGDG I RVHDSQKPQDQDKLS KED VLS PIQ 
MHRA 


5533 


94 


789 


MKERRAPQPWARCKLVLVGDVQCGKTAMLQVLAKDCYPETYVP 
LVKLSHi I ALii£,i EQKVEL5LWDTSGS p yydnvrpi^ysdsdav 
LL CFD 1 5RP ETVDS ALKKWRTE I LD YC PSTRVLL I GCKTDLRTD 
LSTLMELSHQKQAPISYEQGCAIAKQLGPEIYLEGSAFTSEKSI 
HSIFRTASMLCXJIKPSPLPQKSPVRSLSKRLLHLPSRSELISPT 
PKKEKAKXCS I M 


5534 


3 


605 


LVRGRARAANPGRVGftMDGLRQRVEHFLEQRNLVTEVLGALEAK 
TGVBKR YLAAGAVTLLS LYLLFG YGAS LLCNJj IGFVYP AYAS I K 
MESPSKDDDTVWLTYVTVVYAIiFGIiAEFFSDLLLSWFPPYYVGK 
CAFLLFCMAPRPWNGALMLYQRWRPLPLRHHGAVDRIMNDLSG 
RALDAAAGITRNVKPSQTPQPKDK 


5535 


1029 


332 


KSFMDSEARLCS LVELS DTQDETQKSDS ENEDLKI DCLQES QEL 
NLQKLKNSERI LTEAKQKMRELTVNI KMKEDLI KBtilKTGNDAK 
SVSKQYTLKVTKLEHDAEQAKVELTETQKQLQELENKDLSDVAM 
KVKLQKE FRKKVDAAKLRVQVLQKKQQDS KKLAS I»S I QNE KRAN 
ELEQSVDHMKYQKIQLQRKLQEENEKRKQLDAVIKRDQQKIKVI 
LSYIPAXYNMKC 


5536 


942 


282 


AAATAAS LS PRGCRt«RT PSS DVS PS RA? PPSAAPLPTGRAQMS P 
SGRLCLLT 1 VGI*I LPTRGQTLKDTTS SS SADATI MD IQVPTRAP 
DAVYTELQPTSPTPTWFADETPQPQTOTQQLEGTDGPLVTDPET 
HKSTKAAHPTDDTTTLSERPSPSTDVQTDPQTLKPSGFHEDDPF 
FYDEHTLR KRGL L VAAVLF I TG 1 1 1 LTS GKCRQ. I>S RLCRNHCR 


5537 


3 


2391 


RARVSS PQLRVFRSGRPRRLRVIiRINRTS VALRIiAGTGRFVAXT 
PGHPGS WEMGLLT FRDVAVEFS LEE WEKLEPAQKNL YQDVMLEN 
YRNLVSLGIiWSKPDLITFI*EQRKEPWNVKSEBTVAIQPDVPSH 
YNKDLLTEHCTBASFQKVISRRHGSCDLENLHLRKRWKREECEG 
HNGCYDEKTFKYDQFDESSVESLFHQQILSSCAKSYNFDQYRKV 
PTHSSLLNQQEEIDIWGKHHlYDKTSVliFROySTLNSYRNVFIG 
EKNYHCNNSEKTIiNQSSSPKNHQENYFIiEKQYKCKEFSEVFIiQS 
MHGQEKQEOS YKCNXCVEVCTQSUCH I QHQTIHIRENS YS YNKY 
DXDLSQSSNLRKQIIHNEEKPYKCEKCGDSLNHSLHLTQHQIIP 

TPPfrBVTfEJK'POr3in/I?XITJM^ , OT VT wnnn rrwrintrr — — __ — 

i x A.wA&UfKvirnLiNt.olix JbTKQQQIDTGENLYKCKACSKS 
FTRSSNLIVHQRIHTGEKPYKCKECGKAFRCSSYiTKHKRlHTO 
EKPYKCKECGKAFNRSSCLTQHQTTHTGEKLYKCKVCSKSYARS 
SNLIMHQRVHTGBKPYKCKECGKVFSRSSCLTQHRKIHTGBNLY 
KCKVCAKPFTCFSNLIVHERIHTGEKPYKCKEOGKAFPYSSHLI 
RHHRIHTGEKPYKCKACSKSFSDSSGLTVHRRTHTGEKPYTCKE 
vvwtc o j o ou v lunaK j.m iuUKir X ivLlS£,u(jKAxTO iTtSYLTTHQR 
SHTGERPYKCEECGKAFNSRSYLTTHRRRHTGERPYKCDECGKA 
FSYRSYLTTHRRSHSGBRPYKCEECGKAFNSRSYLIAHQRSHTR 
EKL 


5538 


926 


161 


HSMMMKIPWGSIPVLMLl^l^IjIDISQAQliSCTGPPAtPGiPG 
I PGTPGPDGQPGTPGI KGEKGLPGLAGDHGEFGEKGDPGIPGNP 
GKVGPKGPMGPKGGPGAPGAPGPKGESGDYKATQKIAFSATRTI 
NVPLRRDQTIRFDHVI TNMNNNYEPRSGKFTCiCVPGLYYFTYHA 
SSRGNLCVNLMRGRERAQKVVTFCDYAYNTFQVTTGGMVLKLEQ 
GENVFLQATDKNSLLGMEGANSIFSGFLI.FPDMEA 


5539 


38 


1258 


HRGPSGAAAPUC^PRGCJUiEGPRSCRRPQPMARRYDEI*PHYPG 
I VDG PAAIiAS FPETVPAVPG P YGPHRPPQPLPPGLDS DGLKREK 
D12IYGHPL FPLLAL VFEKCELATCS PROGAGAGLGTP PGG D vcs 
SDSFNEDIAAFAKQVRSERPLFSSNPELDNLVIQAIQVLRFHLL 
BLEKVHDLCDNFCHRYITCLKGKMPIDLVIBDRDGGCRBDFBDFn 
PASCPSLPDQNNMWIRDHEDSGSVHLGTPGPSSGGLASQSGDNS 
SDQGDGLDTS VAS PSSGGEDEDLDQERRRNKKRGI FPKVATNIM 
RAWLFQHLSHPYPSEEQKKQLAQDTGLTILQVNNWFINARRRIV 
QPMIDQSNRTGQGAAFSPEGQPIGGYTETQPHVAVRPPGSVGMS 
LNLEGEWHYL 


5540 


148 


144 Q 


P P LGAG AG VHARS PHP ARRLP L'l'TAG VGGRAPDLLPTP WRQHRG 1 

PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPGIVD 

GPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKRBKDEI 

YGHPLFP LLALVFEKCELATCS PRDGAGAGLGTPPGGDVCSSDS 

FNEDNTAFAKQVRSERPLFSSNPELDNLMIQAIQVXjRFHLLELE 

KGKMPIDI,VIEDRDGGCRBDFEDYPASCPSLPOQNNrWIRDHED 

SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGED 

EDLDQE P RRNKKRG I FPKVATNIMRAWLFQHLSHPYPSEEQKXQ 

LAQDTGLTILQVNNWPINARRRIVQPMIDQSNRTGQGAAFSPEG 

QPIGGYTETEPHVAFRAPASVGDEFGTRKEEWHYL 


5541 


143 


1440 


P PICAGAGVHARS PHPARRLP LTT AGVGGRAPDIjLpt P W&QH RG * 
PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPGIVD 
GPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKREKDBI 1 
YGHPLF PLLALVFEKCE LATCS PRDGAGAGLGTPPGGDVCSSDS 
FNEDNTAFAKQWSERPLFSSNPELDNLMIOAIQVLRFHLLELE 
KGKMPIDIiVIEDRDGGCREDFEDYPASCPSLPDQNNIWIRDHED 
SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGED 
EDLDQE P RRNKKRG I FPKVATN IMRAWLFQHLSHPYPSEEQKKQ 
LAQDTGI/TILQVNNWFI NAKR RI VQPM2DQSNRTGQGAAFS PEG 
QPIGGYTET3PHVAFRAPASVGDEFGTRKEEWHYL 


5542 


TTft 


1440 


PPLGAGAGVHARSPHPARRLPLTTAGVGGRAPDLLPTPWRQHRG 
PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDEIiPHYPGIVD 
GPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKREKDBI 
YGHPLFPLLALVFEKCELATCS PRDGAGAGLGTPPGGDVCSSDS 
FNEDNTAF AKQVRSERPLFSSNPELDNLM X QAIQVLRFHLLELE 
KGKMPIDLVIEDRDGGCREDFEDYPASCPSLPDQNNIWIRDHBD 
SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGED 
E DLDQE P RRNKKRG I FP KVATN I MRAWLFQHLSHPYPSEEQKKQ 
LAQDTGLT I LQVNNWF INARRR1 VQPMIDQSNRTGQGAAFS PEG 
QPIGGYTETEPHVAFRAPASVGDEFGTRKEEWHYL 


5543 


2405 


665 


RVfVREQPWPLRTSEAVKTPALRPFPGPRGVSPFPKPDVJGKSPAP 
KRPFSDSGAFWS PERRPG VLEAPRRRPVPASFRAVPPKPTRVHG 
SSASRDRVLARTMIVADSECRAELKDYLRFAPGGVGDSGPGEEQ 
KBSRARRGPRGPSAFIPVEE^LREGAESLEQHLGLEALMSSGRV 
DNLAWMGLHPDYFTS FWR LHYLLLHTDGPLAS SWRHYI AI MAA 
ARHQCSYLVGSHMAEFLQTGGDPEWLLGLHRAPEKLRKL3EINK 
LLAHRPWLITKEHIQALLKTGEHTWSLAELIQALVLLTHCHSLS 
SFVFGCGILPEGDADGSPAPQAPTPPSEQSSPPSRDPLNNSGGF 
ESARDVEALMERKQQLQESLLRDEGTSQEEMESRFELEKSESLL 
VTPSAD ILEPS PHPDMLCFVEDPTFG YEDFTRRG AQAP PTFRAQ 
DYTWEDHGYSLIQRLYPEGGQLLDEKFQAAYSLTYNTXAMHSGV 
DTSVLRRAIWNYIHCVFGI RYDDYDYGEVNQLLERNIjKVYI KTV 
ACYPEKTTRRMYNLFWRHFRHSEKVIIVNLIiLLEARMQAALLYAL I 
RAITRYMT 




1895 


514 


LGGLIXJRC2RI^I^GAGRLGAPMERHGRASATSVSSAGE0AAGD 
PEGRRGEPLRRRAS S ASVPAVGAS AFGTTJR rH3T.n Q YQfTDTC VCD 1 

QRVESLRKKRPLFPWFGLDIGGTLVKLVYFEPKDITAEEBEEEV 
ESLKS IRKYLTSNVAYGSTGIRDVHLELKDLTLCGRKGNIiHFIR 
FPTHDMPAFIQMGRDKNFSSliHTVFCATGGGAYKFEQDFLTIGD 
LQLCKLDELDCLI KGILYI DS VGFNGRS QCYYFENPADSEKCQK 
LPFDLKNP YPLLLVNIGSGVS ILAVYS KDNYKRVTGTSLGGGTF 
FGLCCLLTGCTTFEEALEMASRGDSTKVDKLVRD1YGGDYBRFG 
L PGW A VAS S FGNMMS KEKREAVS KED LARATL I T I TNN I GS 1 AR 
MCTU^ENINCJWFVGN.FLRINTIAMRLLAYAIJSYWSKGQLKALF I 
SEHEGYFGAVGALLELLKIP 


5545 


802 


131 


GAMWSAGRGGAAWPVLLGLLLALLVPGGGAAKTGAELVTCGSVL 
KLLNTHHRVRIiHSHDIKYGSGSOQQSVTGVEASDDANSY^^RIRG 
GSEGGCPRGSPVRCGQAVRLTHVLTGKNLHTHHFPSPLSNNQEV 
SAFGEDGEGDDLDLWTVRCSGQHWEREAAVRF0HVGT3VPLSVT 
GEQ YGS P I RGQHEVHGM PS ANTHNTW KAMEG I P I K PS VE PSAGH 
DEL 


5546 


1592 


146 


FVPRGGHSSMGQSGRSRHQKRARAQAQLRNLEAYAANPHSFVFT 

RGCTGRNIRQLSLDVRRVMEPLTASRLQVRKKNSLKDCVAVAGP 
LGVTHFLI LS KTETNVYFKLMRLPGGPTLTFQVXKYSLVRDVVS 
SLRRHRMHEQQFAHPPLLVLNSFGPHGMHVKLMATMFQNLFPSI 
NVHKVNLNTIKRCLLIDYNPDSQELDFRHYSIKWPVGASRGMK 
KliLQE KFPNMS R LQD I S ELLATG AGLS ESEAEPDGDHNI TE LPQ 
AVAGRGNMRAQQSAVRLTE I GPRMTLQLI KVQEGVG EGKVM FHS 
FVS KTEEELQAI LBAKE KKLRLKAQRQ AQQAQNVQRKQ EQRB AH 
RKXSLEGMKKARVGGSDEEASGIPSRTASLELGEDDDEQBDDDI 
EYFCQAVGEAPSEDLFPEAKQKRLAKS PGRKRKRWEMDRGRGRL 
CDQ KF PKTKDKSQG AQARRGPRGAS RDGGRGRGRGR PG KR VA 


5547 


1592 


146 


FVPRGGHSEMGQSGRSRHQKRARAQAQLRNLEAYAANPHSFVFT 
RGCTGRNIRQLSLDVRRVMEPLTASRLQVRKKNSLKDCVAVAGP 
LGVTHFLILSKTETNVYFKLMRLPGGPTLTFQVKKYSLVRDWS 
SLRRHRMHEQQ FAH P PLLVLNSFG PHGMHVKLMATMFQNLFPS I 
NVHKVNLNT I KRCLLID YNPDS QELDFRHY S I KWPVGASRGMK 
KIiLQEKFPNMSRLQDISELLATGAGLSESEAEPDGDHNITEIjPQ 
AVAGRGNMRAQQSAVRIiTEIGPRWTI^LIKVQEGVGEGKVMFHS 
FVS KTEEELQAI LEAKE KKLRLKAQRQAQQAQNVQRKQEQREAH 
RKKS LEGMKKAR VGGS D EEASG IPS RTA55 LELG E DDDEQEDDD I 
EYFCQAVGE APSEDLFP EAKQKRLAKS PGRKRKRWEMDRGRGRL 
CDQKFPKTKDKS QGAQARRG PRGAS RDGGRGRGRGRPGKRVA 


5548 


1 


2153 


DQTGPPETIAFTFPRSTMEPLCPLLLVGFSLPLARALRGNETTA 
DSNETTTTSGP PDPGAS QPLLAWLIiLPIiLLLLLVLLLAAYF FRF 
RKQRKAWSTSDKKMPNGILEEQEQQRVMLLSRSPSGPKKYFPI 
PVEHLEEE XRI RS ADDCKQFREEFNSLP SGH IQGTFELANKE EN 
REKNRYPNILPNDHSRV1LSQLDGIPCSDYINASYIDGY1CEKNK 
FIAAQGPKQBTVNDFWRMVWEQKSATIVMLTNLKERKEEKCHQY 
MPDQGCWTYGNIRVCVEDCVVLVDYTIRKFCIQPQLPDGCKAPR 
LVSQLHFTSWPDFGVPFT P I GMLKFLKKVKTLNP VHAGP I WHC 
S AGVGRTX3TFI VIDAMMAMMHAEQICVDVFEFVSR IRNQRPQMVQ 
TDMQYTF I YQALLEYYLYGDTELDVSSIjEKHIiQTMHGTTTHFDK 
IGLEEEFRKLTNVRIMKENMRTGNLPANMKKARVIQIIPYDFNR 
VILSWKRGQEYTDYINASFIDGYRQKDYFIATQGPLAHTVKDFW 

KNDTL S E AI S I RDFL VTLNQ PQARQE EQ VR WRQ FH FHGWP E I G 
I PAEGKGM IDL I AAVQKQQQ QTGNHP I TVHCSAG AGRTGT F I AL 
SNILERVKAEGLLDVFQAVKSLRLQRPHMVQTLEQYBFCYKVVQ 
DFIDI FSDYANFK 


5549 


915 


256 


FEATGGKRLAFKMAGTARHDREMAIQAKKKIjTTATDPIERLRIiQ 
C LARGS AG I KGLGRVFR IMDDDNNRTLDFKEFMKGLNDYAWME 
KEEVEELFQRFDKDGNGTI DFNEFLLTLRPPMSRARKEVIMQAF 
RKLDKTGDGV ITI EDLRE VYNAKHHPKYQNGE WS EEQVFRKFLD 
HeUoir lUttlAiLiv J. ran rmri IX AG VSAS I DTDVYF I IMMRTAWKL 


5550 


2364 


1210 


RKRKVFLKMRRLNRKKTLS LVKELDAFP KVPBS YVETSASGGTV 
SLIAFTTMALLTIMEFSVYQDTWMKYEYEVDKDFSSKLRINIDI 
TVAKKCQ Y VGADVLDLAETMVASADGLWEPTVFDriS PQQKEWQ 
RMLQLIQSRLQEEHSLQDVIFKSAFKSTSTALPPREDDSSQSPN 
ACRIHGHLYVNKVAGNFWITVGXAIPHPRGHAHLAALVNHESYN 
FSHRIDHLSFGBLVPAIINPLDGTEKIAIDHNQMFQYF1TWPT 
KLHTYK I SADTHQFS VTERBRI INHAAGSHGVSGIFMKYDLSSL 
MVTVTEEHMPFWQFFVRLCGIVGGIPSTTGMLHGIGKFIVEIIC 
CRFRLGSYKPVNSVPFEDGHTDNHLPLLENNTH 


5551 


211 


1760 


MQRDHTMDYKESCPSVSIPSSt>Ei«^KKXRF , l^kVLVSVGR , SE" > 
WFVFRRYAEFDKLYNTLKKQyPAMAtktPAKRIFG^bNFDPbFIK 
QR RAGLNE FIQNLVR YPELYNHPDVRAFI*QMDS PKHQSDPS EDE 

DERSSQ KLHSTSQN INI/3PSGNPHAKPTDPDFLKVIGKGS PGKV 
LLAKRKLDGKFYAVKVLQKKIVLOTKEQ^ 

PFLVGLHYSF^TTEKLYFVLDFVNGGELFPHLQRERSFPEHRAR 
FYAAEIASALGYLHSIKIVYRDLKPENILLDSVGHVVLTDFGLC 
KEGIAISDTTTTFCGTPEYLAPEVIRKQPYDNTVDWWCLGAVLY 
EMLYGLP PF YCRDVAEMYDNI LHKPLSLR PGVSLTAWS ILEELL 
EKDRQNRiiGAKEDFLEIQNHPFFESIiSWADLVQKKlPPPFNPNV 
AGPDD I RN FDTAFTE ET VP YS VCVSSDYS I VNAS VLEADDAFVG 
FSYAPPSEDLFL 




2748 


930 


LGPAAGAAWGKKhKKHKAEWRSSYEDYADKPLEKPLKLVLKVGG/ 
S E VTELSGSGHDS S YYDDRSDHERERHKEKXKKKKKKSE KE KH h 
DDEERRKRKEEKKRKREREHCDTEGEADDFDPGKKVEVEPPPDR 
P VRACRTQPAENE ST P I QQLLEHFLRQLQRKDPHG FFAFPVTDA 
I APGYSMI I KHPMDFGTMKDKI VANEYKSVTEFKADFKLMCDNA 
MTYNRPDTVYYKLAKKI LHAGFKMMSKQAALLGNEDTAVEEPVP 
EWPVQVErAKKSKKPSREVISCMFEPEGNACSLTDSTABEHVL 
ALVEHAADEARDRINRFLPGGKMGYLKRNGDGSLLYSWNTAEP 
DADEEETH P VDLS S LS S KLLPGFTTIjG FKDERRNKVTFLSSATT 

alsmqnns vfgdlksde mellys aygdetg vqcals iiqe fvkda 
gsyskkwddli^itcgdhsrtlfxjlkorrnvpmkppdeakvg 

DTLGDSSS SVLEFMSMKS YPDVS VD I SMLS S LGKVKKELDPDDS 

hlnldettkllqdlheaqabrggsrpssnlsslsnaserdqhhl 
gspsrlsvgeqpdvthdpyeflqs pepaasakt 


5553 
5554 


74 


1095 


i^reavylvsrmix?pvaehakqepfhwtplleswalsovagwp| 
vflkcenvqpsgsfkirgighfcqemakkgcrhlvcssggnagi 
aaa yaarklg i pati vlpes tslqwqrlqgegae vqltgkvwd 
eanlraqelakrdgwbnvppfdhpliwkghaslvqelkavlrtp 

FGALVIiAVGGGG LLAG WAGLLEVGWQHVP I IAMBTHGAHCFNA 
A ITAGKLVTLPD1 TS VAKS IX3AKT VAARALECMQVCKlHS EVVE 
DTEAVS AVQQLLDDERMLVEPACGAALAAI YSGLLRRLQAEGCL 
PPSLTSVWIVCGGNNINSRELQALKTHLGQV 




166 


2318 


CSGRTGGRG5 LRPABNV CLTCK LS GAETRGLLC PALRTWIMK VL 
GRS FF WVLF PVL PWAVQAVEHEEVAQRVI KLHRG RGVAAMQS RQ 
WVRDS CRKLSGLLRQKNAVLNKLKTAIGAVEKDVGLSDEEKLFQ 
VHTFEI FQKRLNESENSVFO^VYGLQRALQGDYKDVVNMKESSR 
QRLE ALREAAI KEETE YMELLAAE KHQVEALKNMQHQNQSLSML 
DEILEDVRKAADRLEEBIEEHAFDDNKSVKGVNFEAVLRVEEEE 
ANSKQWITKREVEDDLGLSMLIDSQNNQYILTKPRDSTIPRADH 
HF I KD I VTIGMIiS L PCGWIiCTAIGLPTMFGYI I CGVLLGPSGLN 
SIKSIVQVETLGEFGVFFTLFLVGLEFSPEKLRJCVWfCISIiOGPC 
YMTLLM I AFGLLWGHLIjRI KPTQSVFISTCLS LSS TPLVSRFLM 
GSARGDKEGDIDYSTVLLGfOLVT^DVQLGLFMAVMPrLZQAGAS 
ASSSIVVEVLRILVLIG0ILFSLAAVFLLCLVIKKYL1GPYYRK 
LHME S KGMKE I L I LG I SAFI FIuMLTVTELLD VSMELGCFLAGAL 
VSSQG P WTEE I ATS IEP I RDFLAI VFFAS IGLHVFPTFVAYEL 
TVLVFLTLSVVVMKFI^AALVLSl,ILPRSSQYIKWIVSAGLAOV 
S3FSFVLGSRARRAGVISREVYLLILSVTTLSLLLAPVLWRAAI 
TRCVPRPERRSSL 


5555 
5556 


212 


1425 
3346 


ijslrtretpapprceaasqgrvgWradaaaeeavrsvwnrtrdr! 
gtmap<jwi»stfcli»lii yl iga viagrjd fykilqvprsas ikd i k 
kayrklalqlhpdrnpddpqaqekfqdlgaayevlsdsekrkqy 

DTYGEEGLKDGHQSSHGDIFSHFFGDFGFMFGGTPROQDRNIPR 
GSDI I VDLE VTLEBVYAGWFVEVVRNKPVARQAPG KRKCNCRQE 
MRTTQLGPGRFQMTQEVGCDECPPJVKL VNEBR TLB VE IEPG VRD 
GMEYPFIGEGEPHVDGEPGDLRFRIKVVKHPIFERRGDDLYTNV 
TX SI» VESL VGFEMDITHLDGHKVHISRDKI TRPGAKLWKKGEGIi 

PWFDNNNIKGStllTFDVDFPKEQLTEEAREGIKQLLKQGSVQK 
VYXGLQGY 

RTRGMSKNCVPMEFEE YLLRM FQGTFYLLQKI TKDNNAHTVKSR 
IiEELDESYIEK^DFIJiLFVSVHLkftlESVSO^PWEF^lil^r" 
YTFHQPTHEGYFSCLDIWTLFLDYLTSKIKSRLGDKEAVLNRYE 
DALVLLLTEVLNRIQFRYNQAQLEELDDETLDDDQQTEWQRYIiR 
QSLEWAKVMBLLPTHAFSTLPPVLQDNLEVYLGIiQQFIVTSGS 
GHRI^ITAE^CRRLHCSLRDLSSLL(3AVGRLAEYFIGDVFAAR 
FNDALTWERL VKVTL YGSQ I KLYN IETAVP S VLKPDL IDVHAQ 
SIAALQAYSHWLAQYCSEVHRQNTQQFVTLI STTMDAI TPLIST 
KV QDKLLLSACHLLVS LATTVRPVFLIS I PAVQKVFNR ITDASA 
jjttij v ujsjvj v li v i*K/Uj&N l LiS-tLt e W PNLjPENJEQQWP VRS JNKASL I 
SALSRDYRNLKPSAVAPQRKMPLDDTKLIIHQTLSVLEDIVENI 
SGESTKSRQICYQSLQESVQVSLALFPAFIHQSDVTDEMLSFFL 
TliFRGLRVQMGVPFTEQ I IQTFLNMFTREQLAES ILHEGSTGCR 
WEKFLKI LQWVQEPGQVFKPFLPSI IALCMEQVYP 1 1 AERPS 
PDVKAELFELLFRTLHHNWRYFFKSTVLASVORGIAEBQMENEP 
QFSAIMQAFGQSFLQPDIHLFKQNLFYLETLNTKQKLYHKKIFR 
TAMLFQFVNVLLQVLVHKSHDLLQEEIG I AI YNMAS VDFDGFFA 
AFLPEFLTS CDGVDANQKS VLGRNFKMDRVRRBRC3RAKRRAEWA 
RKPGTCAARRGHIEASGRGLCPPCSLAAAHEMPADLVL 


5557 


1712 


491 


v a u^iiv^LiKUt^unwi fVYOiaVKKliRLSAIiAGAGRFCILGSEAATR 
KHLPARNHCGLSDSSPQLWPEPDFRNPPRXASKASLDFKRYVTD 
RRLAETLAQ I YLGKPSRP PHLLLECNPGPG I LTQALLEAGAKW 
ALESDKTFI PHLESLGKNLDGKLRVIHCDFFKLDPRSGGVI KPP 
AMSSROLFKNLGIEAVPWTADIPLKVVGMFPSRGEKRALWKIAY 
DLYSCTSIYKFGRIEVNMFIGEKEFQKLMADPGNPDLYHVLSVI 
WQLACEIKVLHMEPWSSFDIYTRKGPI4ENPKRRBI1LDQLQQKLY 
LI QHX PRQNLFTRNLTPMNYNI FFHLLKHCFGRRSATVI DHLRS 
IiTPLDARDILMQIQKQEDEKVVNMHPQDFKTLFETIERSKDCAY 
KWLYDETLEDR 


5558 


1509 


96 


RAGCTHPQ VPADLGAPAE PRR PQKTCVCLiiQPGp6G0RG PTTMI" 

LLKLKMVQWFRHGARSPLKPLPLEEQVEWNPQLLEVPPQTQFD 
YTVTNLAGGPKPYSPYDSQYHETTI,KGGMFAGQLTKVGMQQMFA 
LGERLRKNYVE DIPFLS PTFNPQE VP 1 RS TN I FRNLESTRCLIjA 
GLFQCQKEGPI I IHTDEADSEVIiYPNYQSCWSLRQRTRGRRQTA 
SLQPGISEDLKKVKDRMGIDSSDKVDFFIIjLDNVAAEQAHNLPS 
CPMLKRFARMIEQRAVDTSLY 1 I*P KEDRES LQMAVGP FIjH I LES 

nllkai^satapdkirklylyaahdvtfipllmtlgifdhkwpp 
favdltmelyqhleskewfvqlyyhgkeqvprgcpdglcpldmf 
lnamsvytlspekyhalcsqtqvmbvgnee 


5559 


150 


1983 


PLAATAHFAKMSRVAKYRRQVSEDPD1DSLLETLSPEEMEELBK 
ELDVVDPDGSVPVGLRQRNQTEKQSTGVYNREAMIiNFCEKETfac 
LMQREMSMDE S KQVE TKTDAKNGE ERGRDAS KKALG PRRDSDLG 
KEPKRGGLKKSFSRDRDEAGGKSGEKPKEEKI I RG IDKGRVRAA 

MKEVAKKEDDEKVKGERRNTDTRKEGEKMKRAGGNTDMKKEDEK 
VKRGTGNTDTKKDDEKVKKNBPLHEKEAKDDS KTKTPEKQTPSG 
PTKPSEGPAKVEEEAAPS I FDEPLERVKNNDPEMTSVNVNNSDC 
ITNE I LVR FTE ALEFNTWKLFAI1ANTRADDHVAFAI AIMLKAN 
KTITSLNLDSNHITGKGILAIFRALLQNNTLTELRFHNQRHICG 
G KTEME IAKLLKENTTLLKJjG YHFELAG PRMTVTNLLS RNMDKQ 
ROKRLOEORQAOEAKGERKDLLE VP KAGAVA.KGS PV PQ vn 15 c dw 

PSPKNSPKKGGAPAAPPPPPPPLAPPLIMENLKNSLSPATQRKM 
GDKVLPAQEKNSRDQLLAAIRSSNLKQLKKVEVPKLLQ 


5560 


9 


921 


SSVVEFSALSVSMACLSPSQLQKFQQDGFLVLEGFIjSAEECVAM 
QQRIGE I VAEMDVPLHCRTEFSTQEEEQLRAQGS TDYFLSSGDK 
IRFFFEKGVFDEKG^LVPPEKSINKIGHALHAHDPVFKSITHS 
FKVQTLARSIjGLQMPVWQSMYIFKQPHFGGEVSPHQDASFLYT 
EPIX3RVLGOTIAVEDATI,ENGCLWFIPGSHTSGVSRRMVRAPVG 

sapgtsflgsepardnslfvptpvqrgalvlihgewhkskqnl 

SDRSRQAYTFHLMEASGTTW3 PENWLQPTAELP F PQLYT 


5561 


2175 


1775 


CYFIFQFFSSPYPGLHPHQTPAPLPNPGLYPPPVSMSP^QPPPQ 
QLIiAPTYFSAPGVKNFGNPSYPYAPGALPPPPPPHLYPNTQAPS 
QVYGGVTY YNPAOOOVO PKPS PP RRT Pf) PVT" 7 * K"PPDT>PvxrQor»e 

s 


5562 


342 


1385 


SSGKNDMAAAGAAGLVRGI,KAGVLSQADYI^VQCBTIiEDLkl^ 
I^STDYGWFLANEASPLWSVIDDRLKEKMWEF^HMRNHAYEP 
LAS FLD FITYS YM I DNVI LLITGTLHQRSI AELVPKCHPLGS FE 
QMEAWIAQTPAELYNAILVDTPIAAFFQDCISBQDLDEMNIEI 
IRNTLYKAYLES FYKFCTLLGGTTADAMCP I LE FEADRRAF I IT 
INSFGTEXiS KEDRAKLFPHCGRLYPEGLAQLARADDYEQVKNVA 

GVFYAFVKLKEQECRNIVWIAECIAQRHRAKIDNYIPIF 


5563 


342 


1385 


SSGKNDMAAAGAAGLVR^LKAGVLSQADytNLVQCETL'&DLKUH 

LQSTDYGNFLANEASPLTVS VIDDRLKEKM WE FRHMRNHAYEP 

LASFLDFITYSYMlDNVILLITGTLHQRSIABtVPKaiPLGSFE 

QMEAVNIAQTPAELYNAILVDTPLAAFFQDCISEQDLDEMNIEI 

IRNTLYKAYIiESFYKFCTLLGGTTADAMCPILEFEADRRAFI IT 

i«w-.C\? l ai^KW?RAKLFPHCX;RLYPEGIjAQLARA^ 

DYYPE Y KLLFEGAGSNPGDKTLEDR F FEHEVKLNKLAFLNQ FHF 

GVFYAFVKLKEQECRNIVMIAECIAQRHRAKIDNYIPIF 


5564 


3 


914 


aVRRDKPAVWTARGRaROSDS^GGWMAOVGAWRTGAIXSI^I^ 
LLGLGLGLEAAASPLSTPTSAQAAGPSSGSCPPTKFQCRTSGIjC 

vpltwrcdrdldcsdgsdeeecriepctqkgqcppppglpcpct 

GVSDCSGGTDKKLRNCSRlACJw^ELRCTLSDDCIPLTWRCDGH 

Dn^PnOCnPT ^ ff^*hfcTPT T nn/ir. „ .fTL..Lfj-lrLrn iw - -- - - - 

vu w S CGTNE I L PEGDATTMG P P VTLE S VTS LRKATTM 
GPPVTliESVPSVGNATSSSAGDQSGSPTAYGVIAAAAVLSASLV 
TATLLLLS WLRAQ ERLR PLGLLVAMKESIiLLSEQKTSLP 


5565 


993 


138 


RWNSPNPARAGSIwRPQRAPGSVSAVAMTAAVFFGCAFIAFGPA 

LALYVFT I ATE PLRI IFLIAGAFFWLVSL LI SSLVWF>5AR VI ID 

NKDGPTQKYLLIFGAFVSVYIQEMFRFAYYiaLKKASEGLKSIN 

PGETAPSMRLLAYVSGLGFGIMSGVFSFVNTLSDSLGPGTVGIH 

uuot^rr ij*oAtTirijVIII»LHVFWGIVFFDGCEKXKWGIL^ 

LTHLIiVSAQTFISSYYGINLASAFlII>VIiMGTWAFLAAGGSCRS 

LKLCLLCQDKNFLLYNQRSR 


5566 


2043 


1232 


SHIQHHGRGAQAP VKM VS WM I SRAW&VFGMLYPAYYS YKAVKT 

KNVKEYVR^MYWIVFALYTVIETVADQTVAWFPLYYELKIAFV 

IWLLSPYTKGASLIYRKFLHPLLSSKEREIDDYIVQAKERGYET 

MVNFGRQGLNLAATAAVTAAVXSQGAI TERLRSFSMHDLTTI QG 

DEPVGQRPYQPLPEAKKKSKPAPSESAGYGIPLKDGDEKTDEEA 

EGPYS DNEMLTHKG PRRSQSMKSVKTTXGRKEVR YGSLKYKVKX 

RPQVYF 


5567 


1554 


233 


CDSECWTPLHAAATOSHLHLVELLIASGANIiLAVNTDGNMPYDL 
CDDEQTIiDCLETAMADRG ITQDS I EAARAVPELRMLDDIRSRLQ 
AGADLHAPLDHGATLLHVAAANGFSEAAALLLEHRASLSAKDQD 
G W E PLHAAAYWGQ VP LVELL VAHGADLNAKS LMDETPLDVCGDE 
EVRAXLIiE LKHKHDALLRAQS RQRSLLRRRTSSAGSRGKWRRV 
SLTQRTDLYRKQHAQEAIVWQQPPPTSPEPPEDNDDRQTGAELR 
P P P PEEDNPEWRPHNGRVGGSPVRHLYS KRLDRSVS YQLS PLD 
STTPHTLVHDKAHHTLADLKRQRAAAKLQRPPPEGPESPBTAEP 
GLPGDTVTPQPDCX3FRAGGDPPIiLKLTAPAVEAPVERRPCCLLM 


5568 
5569 


1731 
2 


587 

835 


AEDRQPASRRGAGTTAAMAASGPGCRSWCLCPEVPSATFFTALL 
SLIiVSGPRLFLLQQPLAPSGLTLKSEALRNWQVYRLVTYI FVYE 
NP I S LLCGA 1 1 1 WR FAGJJFERTVGTVRHCFFTVIFAI FS AI I FL 
S FEAVSSLS KLGE VE DARGFTP VAFAMLGVT T VRSRMRRALVFG 
MVVPSVLVPWLLLGASWLIPQTSFIiSNVCGLSIGLAYGLTYCYS 
IDLSERVALKLDQTFPFSLMRRISVFKYVSGSSAERRAAQSRKL 
NPVPGSYPTQSCHPHIiSPSHPVSQTQHASwQKLASWPSCTPGHM 
PTLPPYQPASGLCYVQNHFGPNPTSSSVYPASAGTSLGIQPPTP 
VWSPGTVYSGALGTPGAAGSKESSRVPMP 

QTPCPLAWERGSRSEDIS^QKPPTCSSFSGMbVGPwStPriL^ 
LKIiLLt^LLLPLRGQANTGCYGIPGMPGLPGAPGKDGYDGIiPGP 
KGEPGIPAIPGIRGPKGQKGEPGLPGHPGKNGPMGPPGMPGVPG 
rrju j. rvvcruiaisiffK X KUKr yi> V FTVTRQTHQPPAPNSLIRFNAVL 
TNPQGDYDTSTGKFTCKVPGLYYFVYHASHTANLCVIiYRSGVK 
WT FCGHTS KTNQ VNSGGVLL RI^VG EEVW LA VNDYYDMVG I QG 
SDS VPSGFLIiF PD 


5570 


264 


946 


RDRRDRGGVATSTEEPARPRAPQSRGPGPVSQTGRGRERGGGDT 
MSSPSPGKRRMDTDVVKLlESKHBVTIIiGGliNEFVVKPYGPQGT 
PYEGGVWKVRVDLPDKYPFKSPSIGFWNKI FHPN IDEASGTVCL 
DVINQTWTALYDLTNIFESFLPQIiLAYPNPIDPLNGDAAAMYIiH 
RPEBYKQKIKBYIQKYATBEALKEQEEGTGDSSSESSMSDFSED 
EAQDMEL 


5571 


264 


946 


RDRRDRGGVATSTEEPARPRAPQSRGPGPVSQTGRGRERGGGDT 
MSSPSPGKRRMDTDWKLIESKHEVTILGGLNEFWKFYGPQGT 
PYEGGVWKVRVDLPDKYPFKSPSIGFMNKIFHPNIDEASGTVCL 
DVIKQTWTALYDLTNIFESFLPQLIiAYPNPIDPLNGDAAAMYLH 
RPEBYKQKIKEYIQKYATEEALKEQEEGTGDSSSESSMSDFSED 
EAQDMEL 


5572 


2602 


2085 


RTDYRTGI PGRRFRVMAAGIX3DVKLGTLGSGSESSNDGGSESPG 
DAGAAAEGGG WAAAALALLTGGG E M LLNVALVALVLLGAYRL WV 
R WGR RGLGAGAGAGEES PATS LPRMKKRDFShEQLRQYDQSRNP 
RILLAVNGKVFDVTKGS KFYG PAGP YGI FAGRDASRGLATFCLD 
KDALRDE YDDLS DLNAVQME S VREWEMQ FKEKYDYVGRLLKPGE 
EPSEYTDEEDTKDHNKOD 


5573 


' 25^2 


219 


VP ARTPNAEDQGP EARAATAT PCQSGflR^ RAGEAAEIX3 V KMAAF 
S EMGVMPE XAQAVEEMDWLL PTD 1QAES IPLILGGGDVLMAAET 
GSGKTGAFS IPVIQI VYETLKDQQBGK3CGKTTIKTGASVLNKWQ 
MNPYDRGSAFAIGSDGLCCQSREVKEWHGCRATKGLMKGKHYYE 
VSCHDQGLCR VG WS TMQAS LDLGTDKFGPGFGGTGKKSHNKQFD 
N YGEE FTMHDT IGC YLD I DKGHVKFS KNGKDLGLAFEI PPHMKN 
QALFPACVLKNAELKFNFGEEEFKPPPKDGFVALSKAPDGYIVK 
SQHSGNAQVTQTKFLPNAPKALI VEPSR BLAEQTLNNI KQFKK Y 
IDNPKLRELLI I GGVAARDQLS VLENGVD I WGTPGRLDDLVS T 
GKLNLSQVRFL VLDEADGLLSQG YSDFINRMHNQ I PQVTS DGKR 
LQVI VCS ATLH8 FDVKKLSE KIMHPPTWVDLKGEDS VPl^rVHHV 
WPVNPKTDRLWERLGKSHIRTDDVHAKDNTRPGANSPEMWSEA 
I KILKGE YAVRA I KEHKMDQA I I FCRTKIDCDNLEQYFIQQGGG 
PDKKGHQFSCVCLHGDRKPHERKQNLERFKKGDVRFLICTDVAA 
K\a±U±»&aVv xVirnnTiPDEKQNYVHRIGRVGRAERMGLAI SLVA 
T EKE KVW YHVCSSRGKGCYinTUiKBDGGCriWYNEMQLLSE IEE 
HLNCTISQVEPD I KVPVDE FDGKVTYGQKRAAGGGS YKGHVDI L 
APTVOEIiAALEKEAf)T £ 3T?T.HT/2VT.T3XiAT.wo'TT7 


5574 


1731 


952 


NEGLEVFKEQELQPEDKGAVPEDASTERSAMASLGLQliVGYlLG 
LLG LLG T LVAMLL P S WKTS S YVGAS IVTAVG FS KGLWME CATHS 
TGITQCDIYSTLLGI/PADIQAAQAMWVTSSAISSIACriSVVGM 
RCTVFCQESRAKDRVAVAGGVFFI LGGLLGFIPVAWNLHG I LRD 
FYSPLVPDSMKFEIGEALYLGIISSLFSLIAGIILCFSCSCQRN 
RSNYYDAYQAQPLATRSSPRPGQPPKVKSEFNSYSLTGYV 


5575 


456 


766 


LLWALPCPPPTAAAVLLSSTGLMELLBKMLALTLAKADSPRTAIi 
LCSAWLLTASFSAQQHKGSLQKDPLLSQACVGCLEALLDYLDAR 
SPDIGRNSPHYLMFP 


5576 


249 


2146 


RSWGAP W FWRMRLLRRRtiMP LRL3^^dA~FVLFLFLLkRb VS S> R 
ESATEKPWLKSLVSRKDHVLDLMLEAMNNLRDSMPKLQIRA^ 
QQTLFS INQSCLPGFYTPAELKPFWERPPQDPKAPGADGKAFQK 
SKWTPLETQEKEEGYKKHCFNAFASDRISLQRSLGPDTRPPECV 
DQKFRRCPPLATTSVI IVFHNEAWSTLLRTVYS VLHTTPAILLK 
EI ILVDDASTEBHLKEKLEQYVKQLQWRWRQBERKGLITARL 
LGASVAQAE VLT FLDAHCE C FHGWLE PLLARI ABDKTVWS PDI 
VTIDLN7FEFAKPVQRGRVHSRGNFDWSLTFGWBTLPPHEKQRR 
KDETYP I KS PTFAGG LFS ISKSY FEH I GT YDNQMB I WGGENVEM 
SFRVWQCGGQLBIIPCSWGHVFRTKSPHTFPKtiTSVIARNQVR 
LAEVWMDS YKK I FYRRNLQAAKMAQEKS FGDI S E RLQLRBQLHC 
HNFS WYLHNVY PEMFVPDLTPTF YGAI XNLGTNQCLDVGBNNRG 
GKPLIMYSCHGLGGNQYFEYTTQRDLRHNIAKQLCLH7SKGALG 
LGSCHFTGKNSQVPKDEEWELAQDQIiIRMSGSGTCLTSQDKKPA 
MAPCNPSDPHQLWLFV 


5577 


3 




KNS uv;s kAj&i. a VWUiii 1 W VLFI IjDJjKVBS SMFCP I1KX1 1 LLP VLLD 
YSLSLNDLNVSPPELTVHVGDSALMGCVFQSTEDXCIFKIDWTL 
S PGEHAKDE Y VL YYYSNLS VP IGRFQNR VHLMGDI LCNDGS LLL 
QDVQEADQGTYX CE IRLKGESQVFKKAWLHVL PEEPKELMVHV 
GGL IQMG CVFQSTE VKHVTKVEW I FSGRRAKEEI VFRYYHKLRM 
SVE YSQSWGHFQNRVNLVGDI FRNDGS IMLQGVRESDGGNYTCS 
IHLGNLVFKKTIVIiHVSPEEPRTLVTPAALRPLViiGGNQLVIIV 
GIVCAT ILLLPVL1 LIVKKTCGNKSSVNSTVLVKNTKKTNPEI K 
EKPCHPERCEGEKHIYSPIIVREVIEEEEPSEKSEATYMTMKPV 
WPSLRSDRNNSLEKKSGGGMPKTQQAF 


5578 


3 


783 


AVESMASPGAGRAPPELPERNCGYREVEYWDQRYQGAADSAPYD 
WFGDPSS FRALLEPELRPEDRILVLGCGNSALSYELFLGGFPNV 
TS VDYSS VVVAAMQARYAHVPQLRWETMDVRKLDFP SAS FDWL 
E KGTL0ALLAGERDPWTVSSEGVHTVDQVLSE VSR VLVPGGRFI 
SMTSAAPHFRTRHYAQAYYGWSLRHATYGSGFHFHLYLMHKGGK 
LSVAQLALGAQILSPPRPPTSPCFLQDSDHEDFLSAIQL 


5579 


3 


1540 


RNSGLARGASAX^HGGGIAGGVGTOCGACA^RCOGVMEGLLtR 
CRALPAI1ATCSRQLSGYVPCRFHHCAPRRGRRLI1LSRVFQPQNL 
REDRVLS LQDKSDDLTCKSQRLMLQVGL I YPAS PGCYHLLPYTV 
RAME KLVRVI DQEMQAIGGQKVNMPSLSPAELWQATWRWDLMGK 
ELLRLRDRHGKSYCLGPTHEEAITALIASQKXLSYKQLPFUiYQ 
VTRKFRDEPRPRFGLLRGREFYMKDMYTFDSSPEAAQQTYSLVC 
DAYCSLFNKLGI1PFVKVQADVGTIGGTVSHEFQLPVDIGEDRLA 
X C PRCSFS ANMETLDLSQMNCPACQGPLTKTKG I EVGHTF YLGT 
KYSS IFNAQFTNVCXSKPTIAEMGCYGLGVTRrLAAAIEVLSTED 
CVRWPSLLAP YQACLI PPKKGSKEQAASEL IGQLYDHI TEAVPQ 
LHGEVLLDDRTHLTIGNRLKDANKFG YP FVI I AGKRALEDPAH F 
EVWCQNTGHVAFLTKDGVMDLLTPVQTV 


5580 


i ft i 


450 


AUAG1 RCI PGFVVPSGAGYSAPAQRGRRSSGRMRAAAAPGLTAP 
WRLLQCCELEAGELGMAVPAAAMGPSALGQSGPGSMAPWC5V5S 
G PS R YVLGMQEL FRGHS KTREFLAHSAKVKS VAHSCDGRRLASG 
SFDKTASVFLLBKDRLVKENNYRGHGDSVDQLCWHPSNPDLFVT 
ASGDKTIRI WDVRTTKCIATVNTKGBNINI CWSPDGOTIAVGNK 
DDWTF I DAKTHRSKAEEQFKFEVNE I S WNNDNNMFFLTNGNGC 
INI LS YPELKPVQS INAHPSNCIC I KFDPMGKYFATGS ADALVS 

EVETGDKLWEVQCESPTFTVAWHPKRPLLAFACDDKDGKYDSSR 
EAGTVKLFGLPHDS 


5581 


54 


947 


GGGSGPRAPSATLLDTGESVAAVASGEDKGIAASAAAAAVFACS 
CS PD P Q S STMN P VY S P VQPGAP YGNP KNMAYTG YPTAY P AAA PA 
YNPSLYPTNS PS YAPEFQFLHSAYATLLMKGAWPQNSSS CGTEG 
TFHLPVDTGTENRTYQASSAAFRYTAGTPYKVPPTQSNTAPPPY 
SPSPNP YQTAM YP TRS A YPQQNL YAQGAYYTQ PVYAAQPH VIHH 
TTWQPNSIPSAIYPAPVAAPRTNGVAMGMVAGTTMAMSAGTLL 
TTPQHTAIGAHP VS MPT YRAQGTPAYS YVPPHW 


5582 


5775 


2739 


IITNNNNVIIPLVIAYHLSGSAQARGERSPAERLMERQKRKADI 

EKGIiQ FIQSTLPLKQEE YEAFLLKLVQNLFAEGNDLFREKDYKQ 

ALVQYMEGlWADYAASDQVALPRELLCKLHVin^CYFTMGLY 

EKALEDSEKALGLDSESIRALFRKARALNELGRHKEAYECSSRC 

SXJVLPHDESVTQLGQELAQKLGLRVRKAYKRPQELETFSLLSNG 

TAAGVADQGTSNGLGSIDDIETDCYVDPRGSPAIiLPSTPTMPLF 

PHVLDLLAPLDSSRTLPSTDSLDDFSDGDVFGPELDTLLDSLSL 

VQGGLSGSGVPSELPQLIPVFPGGTPLLPPWGGSIPVSSPLPP 

ASFGLVMDPSKKLAASVLDALDPPGPTLDPLDLLPYSETRLDAL

SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
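
The one-letter key in the column heading can be applied mechanically to any segment in the table. The following is a minimal sketch, assuming Python; the names `LEGEND` and `expand_residues` are hypothetical and are not part of the listing. It treats whitespace inside a segment as a layout artifact and rejects any symbol that the key does not define.

```python
# One-letter key as given in the column heading; identifiers are illustrative only.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown", "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def expand_residues(segment):
    """Translate a one-letter segment into the names used in the key."""
    names = []
    for symbol in segment:
        if symbol.isspace():
            continue  # spacing carries no sequence information (assumption)
        if symbol not in LEGEND:
            raise ValueError(f"symbol {symbol!r} is not defined in the key")
        names.append(LEGEND[symbol])
    return names


print(expand_residues("WRQ*"))
```

A symbol outside the key (for example a lowercase letter or a digit) raises `ValueError`, which is one way to flag segments that need checking against the original publication.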








DSFXSSTRGSLDKPDSFMEETNSQDHRPPSGAQKPAPSPEPCMPTSf - 
TALLIKNPLAATHEFKQACQLCYPKTGPRAGDVTYREGLEHKCK 
RD I LLGRLRSSEDQTWKR I RPRPTKTS P VGS YYLCKDM INKQDC 
KYGDNCTF AYHQEE IDVWTEBRKGTLNRDLLFDPLGG VKRGS I/F 

I AKLLKEHOO I FTFIjPE T n?r> Q Y PP TT<5 VnCD Q W» a*xr * * v 
inAuunonygir 11 UWolUTUOIUrKlAoimi fliJaJi'a VC-SWT ■ A* ft ft 

HSFYNNKCLVHIVRSTSLKYSK1RQFQBHFQFDVCRHEVRYGCL 
REDSCHFAHSFIELKVWLLQQYSGMTHEDIVQESKKYWQQMEAH 
AGXASSSMGAPRTHGPSTFDLQMKFVCGQCWRNGQWEPDKDLK 
YCSAKARHCWTKERRVIiLVMS KAKRKWVSVRPLPSIRNFPQQYD 
I»CXHAQNGRKCQ YVGWCS FAHSPEERDMWTFMKENKI LDMQQT Y 
DMWLKKHNPGKPGEGTPISSREGBKQIQMPTDYADIMMGYHCWL 
CGKNSMSKKQWQQHIQSEKHKEKVFTSDSDASGWAFRFPMGEFR 
LCDRUJKGKAC?IX3DKCRCAHGQEEIiNEWLDRREVLKQKIiAKAR 
KDMLLCPRDDDFGKYNFLLQEDGDIAGATPEAPAAAATATTGE 


5583 


3 


1265 


JO»Wv\^j\r\»Jloiyjlir , Jt±'i'i'KKrtAW V J\c* IK i X L»l LAa V KPS AS PEE 

I KKAYRKLAL KYHPDKNPDEGE KFKL ISQAYEVLSD PKXRDVYD 
QGGEQA1 KEGG SGS P S FS S PMD I FDMFFGGGGRMAR ERRGKNW 
HQLSVTLEDLYNGVTKKLAliQKNVICSKCEGVGGKKGSVEKCPL 
CKGRGmiHJQQlGPGMVQQIQTVCIECKGQGERXUPKDRCSSC 
SGAKVIREKKI I EVHVEKGMKDGQKILFHGEGDQEPELEPGDVI 
IVLDQKDHS VFQRRGHDL IMKMKIQLSEALCGFKlcri KTLDNRI 
LV ITS KAGEV I KHGDLRCVRDEGMP I YKAPLE KGILI IQFLVIF 
PEKHWLSLEKLPQLEALLPPRQKVRITDDMDQVELKEFCPNEQN 
WRQHREAYEEDEDGPQAGVQCQTA 


5584


3


1265 


IKKAYRKLALKYHPDKNPDEGEKFKLISQAYEVLSDPKKRDVYD 
QGGEQAIKEGGSGSP3FSSPMD1FDMFFGGGGRMARERRGKNW 

hqhsvtledlyngvtkklaljqknvi cekcegvggkkgs vekcpl 
ckgrgmhihiqqigpgmvqqiqtvcieckgqgerinpkdrcesc 
sgakvirekkiievhvekgm:<dgqkilfhgegdqepexepgdvi 
ivldqkdhsvfqrrghdlimkmkiqlsealcgfkktiktldnri 
lvitskagevikhgdlrcvrdegmpiykaplekgiliiqflvif 

PEKHWLSLEKLPQLEAIiLPPRQKVRITDDMDQVELKEFCPNEQN 
WRQHREAYEEDEDGPQAGVQCQTA 


5585 


2619 


915 


LPAGTPESSLHEALDQCMTALDLFZiTNQFSEALSYLKPRTKESM" 
YHSLTYATILEMQAMMTFDPQDILLAGNMMKEAOMLCQRHRRKS 
S VXDS F SSLVKRP TLGQFTEEE IHAEVCYAKCLLQRAALTFLQD 

VKLGVGAFNLTLSMLPTRILRLLEFVGFSGNKDYGLLQLEEGAS 
GHSFRSVLC^LLLCYHTFLTFVIiGTGNVNIEEAEXLLKPYLNR 
YPKGAI FLFLAGR IE VIKGNlDAAlRRFEECCEAQQfWKQFHHM 
CYWELMWCFTYKGQWKMSYFYADIiLSKENCWSKATYIYMKAAYIj 

smfgkedhkpfgddevelfravpglklkiagkslptekfairks 
rryfssnpislpvpalemmyiwngyavigkqpkiitdgileiitx 
aeeklekgpeneysvddeclvkllkglclkyigrvqeaeenfrs 
i sane kk i kydh yl i pnallelalllmeqdrnee ai klles akq 
nyknysmesrthfriqaatlqaksslbnssrsmvssvsi* 


5586
5587 


2619
1768


915 
148 


lpagtpesslhealdqcmtaldlfltnqfsealsylkprtkesm 
yhslt yat i lemqammtfdpqd i llagnmmkeaqmlcqrhrrks 
svtdsfsslvnrptlgqfteeeihaevcyakcllqraaltflqd 

ENMVS FIKGGI KVRNS YQTYKELDSLVQSSQYCKGENHPHFEGG 

VKLGVGAFNLTLSMLPTRILRLLEFVGFSGNKDYGLLQLEEGAS 

GHSFRS VLCVMIiLCYHTFLTFVLGTGNVN I EEAEKLLKPYLNR 

YPKGAIFLFLAGRIEVIKGNIDAAIRRFEECCEAQQHWKQFHHM 

CYWELMWCFTYKGQWKMSYFYADLLSKENCWSKATYIYMKAAYL 

SMFGKEDHKPFGDDEVELFRAVPGLKLKIAGKSLPTEKFAIRKS 

RR YFSSNP I SLPVPALEMMYI WNG Y7AVIGKQPKLTDG I LE 1 1 TK 

AEEMLEKGPENEYSVDDECLVKLLKGLCLKYLGRVQEAEENFRS 

ISANEKKIKYDHYLIPNALLELALLLMBQDRNEEAIKLLESAKQ 

NYKNYSMESRTHFR1QAATLQAKSSLENSSRSMVSSVSL 

SSAVPDGAVGRPVAVAVGGPPHS CRCRPCCLMAAIG VHLGCTSA
CVAVYKDGRAGWANDAGDRVTPAWAYSENEEiVGLAAKQg^l" 
RNI SNTVMKVKQI LGRSSS DP QAQK YI AESKCLVI EKNG KLR YE 
IDTGEETKFVKPEDVARLI FSKMKBTAHSVLGSOANDWITVPF 
DFGEKQKNALGEAARAAG FNVLRL I HE ?S AALLA YG IG QDS PTG 
KSNILVFKLGGTSLSLSVMEVNSGIYRVLSTNTDDNIGGAHFTE 
TLAQ YLAS EFQRS FfCHDVRGNARAMMKLTNS AEVAKHS LSTLGS 
ANCFLDSLYEGQDFDCNVSRARFELLCS PLFNKCI EAI RGLLDQ 
NGFTADD INKWLCGGSSR I PKLQQLI KDLFPAVELLNS I P PDE 
VI P IGAA I E AG I L IG KEKLL VEDSLM I E CSARD IL VKGVDESGA 
SRFTVLFPSGTPLPARRQHTLQAPGSISSVCLELYESDGKNSAK 
EETKFAQV\n^DLDKKENGIiRDI LAVLTMKRDGSLHVTCTDQET 
GKCEAISIEIAS 


5588


3 


589 


T PP P PEQAMVAATVAAAWLLLWAAACAQQEQDF YDFKAVNI RGK 
LVSLEKYRGSVSLVVNVASECXSFTDQHYRALQQriQRDLGPHHFN 
VLAFPCNQFGQQE PDS NKB I E S FARRTY S VSFPMFS KIAVTGTG 
AHPAFKYLAQTSGKEPTWNFWKYLVAPDGKWGAWDPTVSVEEV 
RPQI TALVRKL I LLKREDL 


5589 


1884 


553 


LRQAWHEGGIGQTDKERGAAALPGEEGDPTRGRSLGRASWESGS 
*'KKi'K& j? F5 S FIj FRPI CI*SIjEARPCS I EDRRNWS L IGRPGAPAS 
GLNRSSGLWLGPDRCRPRSRCSCRVMENPSPAAALGKALCALLL 
ATLGAAGQPLGGESI CSARAPAKYS I TFTGKWSQTAFPKQYPLF 
RPPAQWSSIiLGWAHSSDYSMWRKNQYVSNGLRDFAERGEAWALM 
KE I EAAGE ALQS VHAVFSAP AVP SGTGQTS AELEVQRRHS LVS F 
WRIVPS PDWFVG VDSLDLCDGDRWREQAALDLYPYDAGTDSG? 
TFSSPNFATIPQDTVTEITSSSPSHPANSFYYPRLKALPP1ARV 

x umuiR^o «r ivrvr ± trtrntr vLttroKl/Viol VuoAo VjtETPIjDCEVSIjW 

SS WGLCGGHCGRLGTKSRTRYVR VQPANNGS PCPELEEEAECVp 
DNCV 


5590 


72 


896 


LCSSGALRLLPAMVAWRSAFLVC^l?S)Ul^VQRdSGDFBD"F^L~ 
EDAVKE TS S VKQP WDHTTTTTTNRPG TTRAPAK P PGS GLDLADA 
LDDQDDGRRKPG IGGPJ2RWNHVTTTTKR PVTTRAPANTLGND FD 
IiADALDDRNDRDDGRRKPIAGGGGFSDKDLEDXVGGGEYKPDKG 
KGDGRYG SND DPGSGMVAEPGTI AG VASALAMAliI GAVSS Y I S Y 
QQKKFCFS I QQGLNAD YVKGENLEAWCBE PQVKYSTLHTQSAE 
PPPPPEPARI


5591


68 


1494 


AGSSRRAAAERLLVSAGCRSLAGRASGVLLLPAELLPGEEEAMA
IJiVTRNSKIMAENKAKIMMAGAJaiVPTAPAlVTQWDr'T dddtst r> 

DIGNKVS EQLQAKM PMKKEAKP S ATGKV I DKKCP KPLEKVP MLV 
PVPVSEPVPEPEPEPEPEPVKEEKLSPEP IIjVDTASPS PMETSG 
C^ABSDLCQAFSD VIIAV^VDAEDGADPNLCSBYVICDI YA YL 
RQLEEEQAVRPKYLLGREVTGNMRAILIDWLVQVQMKFRLLQET 
mMTVSIIDRFMQNNCVPKKMriQLVGVTAMFIASKYEEMYPPBI 
GDFAFVTDNTYTKHQIRQMEMKI LRALNFGLGRPLPLHFLRRAS 
KIGEVDVEQHTLAKYLMELT>ILDYDMVHFPPSQIAAGAFCLALK 
I LDNGE WTPTLQHYLS YTEESLL PVMQHLAKHAAMVNQGLTKHM 
TVKNKYATSKHAKISTLPQLNSALVQDLAKAVAKV 


5592 


242 


924 


YGES KDWNQKDLLSALVLTTVNCLPTP IMAKSAEVKLAI FGRAG ' 
VGKS AI/VVRFLTKR FI WEYDPTLESTYRHQATIDDE WSMEILD 
TAGQEDT IQR EGHMRWGEGFVLVYDI TDRGSFEEVIiPLKNI LDE 
IKKPKNVTLILVGNKADLDHSRQVSTEEGEKLATELACAFYECS 
ACTGEGNITE I FYELCREVRRRRMVQGKTRRRSSTTHVKQAINK 
MLTKISS 


5593 


3 


1113


HASGGRAANMAAERGAGQQQSQEMMEVDRRVESEESGDEEGKKH
SSGIVADLSEQSLKDGEERGEBDPEEEHELPVDMETINLDRDAE 
DVDLNHYRlGKlEGFEVIjKKVICTLCIiRQNljI KCl ENLEELQSLR 
BLDLYDNQIJCKIENLEALTELEILDISFNLLRNIEGVDKLTRLK 
KL FLVNNKI S KIENLSNLHQLQMLEIjGSNRI RAI EN I DTLTNLB 
S L FLG KNKI TKLQNLDALTNL T VLS MQSNR^TKIEG LQNLVNLR 
ELYLSHNGIEVIEGLENNNKLTMLDIASNRIKKIENISHLTELQ 
EFWMNDNLLESWSDLDELKGARSLETVYLERNPIiQKDPQYRRKV 






WO 01/53312 



PCT/US00/34263


5594 


3 


1113 


HASGGRAANMAAERGAGQQQSQEMMEVDRRVBSEESGDEEGKKH 
SSG I VADLSEQS LKDGEERGEEDPEEEHELPVDMET I NLDRDAE 
DVDIjNHYRIGKIEGFEVLKKVKTIiCXiRQNLIKCIENLEELQSLR 
ELDLYDNQ I KK I ENLE ALTELE I LDI SFNLLRN I EGVDKLTRLK 
KLFLVNNKISKIENLSNLHQLQMLELGSNRIRAIENIDTLTNLE 
SL FLG KNKI T KLQNLDALTNLTVLSMQSNRLTKI BGLQNLVNLR 
ELYLSHNQIEVIEGLENNNKLTMLDIASNRIKKIENISHLTELQ 
EFWMNDNLLESWSDLDELKGARSLETVYLERNPLQKDPQYRRKV 
MLALPSVRQIDATFVRF 


5595 


3 


1476 


ARWNGRWVQVPAWPGPGCGTNASGE^QRQLPRAWRPVGRTLGSE 
PI ALAWSPPLYLFPI PLPSWAVSQPTPTLGTMFADLDYDI EEDK 
LGI PTVPGKVTLQKDAQNIiIGIS IGGGAQYCPCIiYIVQVFDNTP 
AALDGTVAAGDE I TG VNGRS I KGKTKVE VAKM I QE VKGE VT I HY 
NKLQADPKQGMSLDIVLKKVKHRLVENMSSGTADALGLSRAILC 
NDGLVKRLEELERTAELYKGMTEHTKNLLRAFYELSQTHRAFGD 
VFSVI G v RE PQPAASEAFVKFADAHRS IEKFG I RLLKT I K PMLT 
DLNTYLNKAI PDTRLTIKKYLDVKFEYLSYCLKVKEMDDEEYSC 
rALGEPLYRVSTGNYEYRLILRCRQEARARFSQMRKDVLEKMEL 
LDQKHVQD I VFQLQRLVSTMS KYYNDCYAVLRDAD VFPI EVDLA 
HTTLAYGLNQEEFTDGEEEEEEEDTAAGEPSRDTRGAAGPLDKG 
GSWCDS 


5596


698 


219 


GAVIAPSSLPAAELAAQGESQSLBDLSNTSRPTSEVYKISFIFP 
NGDKYDGDCTRItS SG I YERNGIG IHTTPNG I VYTGS WKDDKMNG 
FGRLEHFSGAVYEGQFKDNMFHGLGTYTFPNGAKYTGNFNENRV 
KGEGEYTHIQGTRMDWTFHFTSCSQT 


5597


3 


731 


I SCKMAADGQSSLP AS WRS VTLTHVEYPAGDLSGHLLA YLSLS P 
VFVIVGFVTLIIFKRELHTISFLGGIALNEGVNWLIKNVIQEPR 
PCX3GPHTAVGTKYGMPSSHSQFMWFFSVYS FLFLY LRMHQTNNA 
RFIJDLLW RHVLSLGLLAVAFLVS YSR VYLLYHTWS QVLYGGI AG 
GLMAIAWFIFTQEVLTPLFPRIAAWPVSEFFIjIRDTSLIPNVLW 
FEYTVTRAEAHNRQRKLGTKLQ 


5598 




2440 


GIGPIAASFIFCKVASIiYIFLSPPPPSVSGVPYSPANSSWSCAL 
VPLLGSGVPPHPPAPSPCCSGO/TMLK^SPICLLLLAVALGFFEG 
DAKFGERNEG5GARRRRCLNGNPPKRLKRRDRRMMSQLELLSGG 
EMLCGGFYPRXSCCLRSDSPGLGRLEWXIFSVTNNTECGKLLEE 
I KCALCS PHSQS LFHS PEREVLERDLVLPLLCKD YC KEFFYTCR 
GHIPGFLQTTADEFCFYYARKDGGLCFPDFPRKQVRGPASNYLD 
QMEE YDKVEEI SRKHKHNCFCIQEWSGLRQPVGALHSGDGSQR 
LFILEKEGYVKILTPEGEI FKEP YLDIHKLVQSGIKGGDERGLL 
SliAFHPNYKKNGKLYVSYTTNQERWAIGPHDHILRVVEYTVSRK 
NPHQVDLRTAR VFLE VAELHR KHLGGQLL FGPDGFLY 1 1 LGDGM 
ITLDDMEEMDGLSDFTGSVLRiiDVDTDMCNVPySI PRSNPHFNS 
TNQPPEVFAHGLHDPGRCAVDRHPTD I N INLTIL C SD SNG KN RS 
SARILQI I KGKDYESEPSLLEFKPFSNGPLVGGFVYRGCQSERL 
YGSYVFGDRNGNFLTLQQSPVTKQWQBKPLCLGTSGSCRGYFSG 
HILGFGEDELGEVYILSSSKSMTQTHNGKLYKIVDPKRPLMPEE 
CRATVQPAQTLTS ECS RLCRNGYCTPTGKCCCSPGWEGDFCRTG 


5599


326 


2440 


GIGPIAASFIFCKVASLYIFLSPPPPSVSGVPYSPAWSSWSCAL 
VPLLGSGVPPHPPAPSPCCSGQTMLKMLSFKLLLLAVALGFFEG 

EMLCGG FYPRLSCCLRSDS PGLGRLENKI FSVTNNTECGKLLEE 
I KCALCS PHSQS LFHS PERBVLERDLVIiPIiLCKDYCKBFFYTCR 
GHIPGFLQTTADEFCFYYARKDGGLCFPDFPRKQVRGPASNYLD 
QMEEYDKVEEISRKHKHNCFCIQEVVSGLRQPVGALHSGDGSQR 
LFILEKEGYVKILTPEGEIFKEPYLDIHKLVQSGIKGGDERGLL 
SLAFHPN Y KKNGKLYVS YTTNQERWAIGPHDHI LRWE YTVSRK 
NPHQVDliRTARVFLEVAELHRKHLGGQLLFGPDGFLYI I LGDGM 
I TliDDMEEMDGLSDFTGS VLRLDVDTDMCNVP YS IPRSNPHFNS 
TNQPPEVFAHGLHDPGRCAVDRHPTDININLTILCSDSNGKNRS
S AR I LQI I KGKD YES EPS LLE FKP FS NGPL VGG FVYRG CQS ER L 
YGSYVFGDRNGNFLTLQQSPVTKQWQEKPLCLGTSGSCRGYFSG 
K I LG FGEDE LGEVY I LS S S KSMTQTHNGKL YKI VDPKRPLMPEE 
CRAT VQPAQTLTS ECSRLCRNGYCTPTGKCCCS PGWEGDFCRTG 


5600 


1977 


1244 


SLRVLSGHLMQTRDLVQPDKPASPKFIVTLDGVPSPPGYMSDQE 
EEMCFEGMKPVNQTAASNKGLRGLLHPQQLHLLSRQLEDPNGSF 
SNAEMSELS VAQKPEKLLERCKYWPACKNGDECAYHHPI S PCKA 
FPNCKFAEKCLFVHPNCKYDAKCTKPDCP FTHVSRRI PVLS PKP 
AVAPPAPPSS SQLCR YFPACXKMBCPP YHPKHCR FNTQCTR PDC 
TFYHPTINVPPRHALKWIRPQTSB 


5601


1977 


1244 


SLRVLSGHLMQTRDLVQPDKPAS PKFI VTLDGVPS PPGYMSDQE 
EDMCFEGMKPVNQTAASNKGLRGLLHPQQLHLLSRQLEDPNGSF 
SNAEMS ELS VAQ KPEKLLERCKYW PACKNGDECAYHH PI S P CKA 
FPNCKFAEKCLFVHPNCKYDAKCTKPDCPFTHVSRRIPVLS PKP 
AVAP PAP PSS S QLCRY FPACKKME CPF YHPKHCR FNTQCTRPDC 
TFYHPTINVPPRHALKWIRPQTSB 


5602 


246 


766 


YHTS CTVWRTAKEALEOTEVPVGCLtWYN^ 
NATRHAEMVA IDQVLDWCRQSGKS PS E VFEHTVL YVTVEPC I MC 
AAALRLMKI PLWYGCQNE R FGG CGS VL5TI AS ADLPNTGRP FQC 
I PGYRAEEAVEMLKTF YKQENPNAPKS KVRKKEOQQ ILNMF 


5603 


1 


565 


FRGRT P I SGGERGCAQYP I PATPARSGENRTMPGAGDGGKAPAR 
WLCTGLLGLFLLP VTLSLE VS VGKATD I YAYNG TE I LLPCTFS S 
CFGFEDLHFRWTYNSSDAFKI LIEGTVKNEKSDP KVTLKDDDR I 
TLVGSTKEKRNN1S I VLRDLE FSDTGKYTCHVKNP KENNLQHHA 
TIFLQWDRRMQ 


5604 


1 


1506 


edifpaqllklqrhervwqqeppvrdhrswggsgAggvagrewt 
dcjgqvalgghymaegegy famsedelacs p yi plggdfgggd fg 
ggdfgggdfgggdfggggsfgghcldycesptahcnvlnweqvq 

RLDGILSETIPIHGRGNFPTLELQPSLIVKWRRRLAEKRIGVR 
DVRLNGSAASHVLHQDSGLGYKDLDL I FCADLRGEGE FQTVKDV 
VLDCIiLDFLPEGWKEKITPLTLKEAYVQKMVKVCNDSDRWSLI 
SLSNNSGKNVELKFVDSLRRQFEFSVDSFQI KLDS LLLFYECSE 
NPMTETFHPTIIGESVYGDFQEAFDHLCWKIIATRNPEEIRGGG 
LLKYCNLLVRGFR PAS DE I KTLQRYMCSRFF I D FS D I GEQQRKL 
ESYLC^FVGLEDRKYEYLMTLHGVVNESWCI^IGHERRQTLNL 
ITMLAIRVLADQNVI PNVANVTC Y YQ PAP YVAD ANFSN YYXAQV 
QPVFTCQQQTYSTWLPCN


5605 


35 


1821 


SQRSCPRSPSSPAPPWARCSNPDSRTGGVPVPRAWSAGGPALGL 
MAAP VRLGRKR PLPACPNPLF VRWLTEWRDEATRSRHRTR FVFQ 
KALRSLRRYPLPLRSGKEAKILQHFGDGLCRMLDERLQRHRTSG 
GDHAPDSPSGENSPAPQGRLAEVQDSSMPVPAQPKAGGSGSYWP 
ARHS GARVI LLVLYREHLNPNGHHFLTKE ELLQRCAQKS PR VAP 
GSARPWPALRSLLHRNLVLRTHQPARYSLTPEGLELAQKLAESE 
GLSLLNVGIGPKEPPGEETAVPGAASAELASEAGVQQQPLELRP 
GEYRVLLCVD I G ETRGGGHRPELLREIX2RLBOTHTVRKLHVGDF 
VWVAQETNPRDPAN PG ELVLDH I VERKRLDDLCS S 1 1 DGRFREQ 
KFRL XRCX3LERRVYLVEEHGS VHNLS LPESTLLQAVTNTQVI DG 
FFVKRTADIKESAAYLALLTRGLQRLYQGHTLRSRPWGTPGNPE 
SG AMT3PN PLCS LLTFS DFNAGAIKNKAQS VREVFARQLMQ VRG 
VSGEKAAALVDRYSTPASLLAAYDACATPKEQETLLSTIKCX5RL 


5606 


3 


1099 


GRSRCPGPGARGGTMSPRSCI^SLRLLVFAVFSAAASNWLYLAK 
LSSVGS ISEEETCEKLKG.LIQRQV^MCKRNLEVMDSVRRGAQLA 
IEECQYQFRNRRWNCS TJjDSLPVFGKVVTCXSTREAAFVYAI S SA 
GVAFAVTRACS SGELEKCGCDRTVHGVSPQGFQWSG CS DN I AYG 
VAFSQS FVDVRERS KG ASSSRALMNLHNNEAGRKAI LTHMRVEC 
KCHGVSGS CE VKT C WRAVPP FRQ VGHALKEKFDGATBVE PRR VG 
S S RAL VPRNAQ FKPHTDEDL VY LE PS PDFCEQDMRSGVLGTRGR 
TCNKTS KAIDGCELLCCX5RGFHTAQVELAERCS CKFHWCC FVKC 
RQCQRLVELHTCR


5607 


521 


141 


PP VCNPAEAMPS PGT VCSLL1.LGM I iWLDLAMAGSSFLSPEHQR 1 ^ - 
QQRKBSKKP PAKLQ PRALAGWLRP BDGGQAEGAEDELE VRFNAP 
FDVGIKLSGVQYQQHSQALGKFLQDILWEEAKEAPADK 


5608 


2 


983 


WFQSPIJIQADPGPPRHTLFMDFVAGAIGGVCGDAVGYPLDTVKV 
RIQTEPKYTGIWHCVRDTYHRERVWGFYRGLLLPVCTVSLVSSE 
VFGTYRHCLAH ICRLRFGNPDAKPTKAD I TLSGCASGLVRVFLT 
SPTEVAKVRLQTQTQAQKQQRRLSASGPIAVPPMCPVPPACPEP 
KYRGPLHCLATVAREEGLCGLYKGSSALVLRDGHSFATYFLSYA 
VLCEWLSPAGHSRPDVPGVLVAGGCAGVLAWAVATPMDVIKSRL 
QADGQGQRRYRGLLHCMVTIVREEGPRVLFKGLVLNCCRAFPVN 
MWFVAYBAVLRLARGLLT 


5609 


1628 


304 


AKGVW VLP S P P PRPGRGALVSGSGIiRRGRSGTSWRPRRMNHKSK 
i\.\^i\ijf-uixz\ujrtj\ir nuivL/aLiuw i KnU i xcior oijSPAAVADNVERAD 
ALQLSVEEFVERYERPYKPWLLNAQEGWSAQEKWTLERIiKRKY 
RNQKFKCGEDNDGYSVKMKMKYY1EYMESTRDDSPLYIFDSSYG 
EHPKRRKLLEDYKVPKFFTDDLFQYAGEKRRPPYRWFVMGPPRS 
GTGXH I DPLGTSAWNALVQGHKRWCLFPTSTPRELIKVTRDEGG 
NQQDEAITWFNVI YPRTQLPTW P PEFKPLE I LQKPGETVFVPGG 
WWII WIiNIjDTTIAI TQNFAS STNFP WWHKTVRGRPKLS RKW YR 
ILKQEHPELAVIADSVDIiQESTGIASDSSSDSSSSSSSSSSDSD 
SECESGSEGDGTVHRRKKRRTCSMVGNGDTTSQDDCVSKERSSS 

R 


5610 


54 


1196 


LERTPASADMAWTKYQLFLAGI^LVTdrSiMTIjSAKWADNFMAEG 
CGGSKEHS FQHPFLQAVGM FLGE FS CLAAFYLLRCRAAG QSDSS 
VDPQQPFNPLLFLPPALCDMTGTSIJWYVAIiNMTSASSFQMLRGA 
VE I F TG LFS VAFLGRRL VLS Q WLG I LAT I AGL VWG LAD LLS KH 
i/oyniujocivi ± uujjDiiMAyl J. VAiyMVIiEEKFVYKHNVHPIiRA 
VGTEGLFGFVILSLLLVPMYYIPAGSFSGNPRGTLEDAIjDAFCQ 
VGQQPLI AVALLGNI SS I AFFNFAG I S VTKELSATTRMVLDSLR 
TWI WALS LALG WEAFHALQ I LGFLILL IGTALYNGLHRPLLGR 
LSRGRPLAEES EQERLLGGTRTPINDAS 


5611 


2 


<M 


FVLPNRLGIPGSTFRGPGACASSSSLAASAKPGAGGSPALAMSG 
ELSNRFQGGKAFGLLKARQERRLiAEINREFLCDQKYSDEENLPE 
KXTAFKEKYNEFDL^EGEIDLMSIiKRMMEKLGVPKTHLEMKKM 
ISEVTGGVSDTISYRDFVNMMLGKRSAVI,KLVMMFEGKANESSP 
KPVGPPPERDIASLP 


5612 


1 


721 


ASRDGYMDATIAPHRIPPEMPQYGEECJHIFELMQAMWLCK«LNS 
SLLTLENLI LNE FS YTATEARRLYLQRKT VPSALLVQL I QERLA 
EEDCI KQGWILDGI PETREQALRIQTLG ITPRHVI VLSAPDTVL 
IERNLGKRIDPQTGBIYHTTFDWPPESEIQNRLMVTEDISBLET 
AQKLLEYTiRNIVRVIPSYPKILKVISADQPCVDVFYQALTYVQS 
NHRTNAPFTPRVLLliGPVGS 


5613 


115 


1279 


RGVDPALRRAEKMLPLSIKDDEYKPPKF1JLFGKISGWFRSILSD 
KTSRKLFFFLCLNLSFAFVBliLYGIWSNCLGLISDSFHMFFDST 
AILAGLAAS VI S KWRDNDAFS YGYVRAEVLAGFVNGLFLI FTAF 
F I FSEGVERALAPPDVHHERLLLVS ILGFVVMIilG I FVFKHGGH 
GHSHGSGHGHSHSLFNGALDQAHGHVDHCHSHEVKHGAAHSHDH 
AHGHGHFHSHDGPSLKETTGPSRQILQGVFLHIIiADTLGSIGVI 
AS AIKMQNFGLMXADPI CSIL I AI L I WSVI PLLRE S VG I LMQR 

TPPLLBNSLPQCYQRVQQLQGVYSLQEQHFWTLCSDVYVGTIiKL 
I VAPDADARW I LSQTHNI FTQAG VRQLYVQ IDFAAM 


5614 


3 


1268 


LLSRNEHACPLQAGLGLTQRKPKAIRGREGRATNQGQGETQNER 
APWGARQRLG VMAELQQLQEFEI PTGREALRGNHSALLRVADYC 
EDNYVQATDKRKALEETMAFTTQALASVAYQVGNLAGHTLRMXiD 
L0^AALRQVEARVSTUK3MVNMHMEKVARREIGTLATVQRLPPG 
QKVXAPENL P PLTP YCRRPLNFGCLDD I GHGI KDLSTQLSRTGT 
LSRKS I KAPATPASATLGRP PR I PEPVHLPWPDGRLSAASSAS 
SLASAGSAEGVGGAPTPKGQAAPPAPPLPSSLDPPPPPAAVEVF 
QRPPTLEELSPPPPDEELPLPLDLPPPPPLDGDELGLPPPPPGF 
GPDEPSWVPASYI^KVVTLYPYTSQKDNELSFSEGTVICVTRRY
SDGWCEGVSSBGTGFFPGNYVEPSC


5615 


9 


1558 


AUSRRRPGDPREMEAAATPAAAGAARREELDMDVMRPLliiEgNF - 

DGTSDEEHEQELLPVQKHYQLDDQEGISPVQTMHLLKGNIGTG 

LLGLPLAIKNAGIVLGPISLVPIGIISVHCMHILVRCSHFLCLR 

F KKSTLG YS DTVS FAME VS PWS CLQKQAAWGRS WDFFLVITQIj 

GFC5VYIVPLAENVKQVHEGFLESKVFISNSTNSSNPCERRSVD 

LRIYMLCFLPFIILLVFIRELKNIjFVLSFLANVSMAVSLVIIYQ 

YWRNK PDPHNLP I VAGWKKYPIjFFGTAVFAFEGIG WLPLENQ 

mkeskkfpqalnigwgivttlyvtlatlgymcfhdeikgsitln 
lpqdvwlyqsvkilysfgifvtysiqfyvpaeiiipgitskfht 
kwkqicefgirsflvsitcagailiprldivisfvgavssstla 
lilpplveiltfskehyniwmvlknisiaftgwgfllgtyitv 

EE 1 1 YPTP KWAGTPQS P FLNIiNSTCLTSGLK 


5616 


1 


719 


ddfvrcgpqsaamgasarllravimgapgsgkgtvssritthfe 
lkhls sgdllrdnmlrgte igvlakafi dqgkli pddvmtrlal 

HEIiKNLTQYSWLIiDGFPRTLPQAEALDRAYQIDTVINLNVPFEV 

ikqrltarwihpasgrvyniefnppktvgiddltgepliqredd 
kpetvikrlkayedqtkpvleyyqkkgvletfsgtetnkiwpyv 

YAFLQTKVPQRSQKASVTP 


5617


176 


765 


pwrgrgsrprgagamaeeqvnrsaglapdceasatabttvssvg 
tceaagkspepkdydstcvfcriagrqdpgtellhcenedlicf 
kdikpaathhylwpkkhigncrtlrkdqvelvenmvtvgktil 
epj^^tdftnvrmgfhmppfcsishlhlhvlapvi^lgfi^klv 
yrvnsywfitadhlieklrt 


5618 


3 


1692 


ylnyinlksenklsgkedlweklqyi.wkstlnlpedi^rvpdes 
lflnsggdslksirllsei eklvgts vpgllei ilsss ILE I YN 
h i lqtwpdedvtfrks catkrklsn inqeeas gtslhqka i mt 
ftchneinafwlsrgsqilsujstrfltklghcssacpsdsvs 
qtniqnlkglns pvligks kdpscvakvseegkpaigtqkmelh 
vrwrsdtgkcvdasplwiptfdkssttvyigshshrmkavdfy 
sgkvkw2qi lgdr i essacvs kcgnf iwg cyngl vyvlks nsg 
ekywmfttedavkssatmdptipgli y1gshdqhayaldi yrkkc 

VWKS KCGGTVFSS PCLNLI PHHLYFATLGGLLLAVNPATGNVI W 
KHS CGKPLFS S PQCCSQY I CZGCVDGNLLCFTHFGEQ VWQFS TS 
GP 1 FS S PCTSPSEQKI FFGSHDCFI Y CCNMKGHLQWKFETTS RV 
YAT P FAFHNYNG SNEMLLAAAS TDGKVW I LESQSGQLQSVYELP 
GEVFSSPWLESMLIIGCRDNYVYCLDLLGGNQK 


5619 


2160 


1477 


DSPVLPTSGNVISTAQPAQPWSAVEAALRSIiGSPPGAGRGCPCP 
AQSLHSHQLAAWDPLKPSLRSYPPHLLQHPQLRSLTASSGHLGR 
RSCPQPRPLEELLRAGSSTRPQPLTSSCCGMSCMYSFLGHCSVL 
LWGTKGRGSGSPSSPGCCLHPPAQHSQDLPLVHVDVGWQPPLGP 

TVGLRPGLLGERQRGALRAGDPQCQCPLPATVREDLGVPSPWAA 
ECSPPATP 


5620 


$30 


182 


PLPPPTLAMFLTRSEYDRGVNTFSPEGRLFQVEYAIEAIKLGST
AIGIQTSEGVCIiAVEKRITSPLMEPSSIEKIVEIDAHIGCAMSG 
L I ADAKTL I DKAR VETQNHWFT YNETMT VE SVTQAVSNLALQ FG 
EEDADPGAMS RP FGVALLFGG VDEKGPQLFHMDPSGTF VQCDAR 
AIGSAS EGAQSS LQE VYHKSMTLKEAI KSS LI I LKQVM EGKLNA 
TNIELATVQPGQNFHMFTKBELEEVIKDI 


5621 
5622


3 


819 


WEF VE YTATDAN VKNESfcS S VQQLGI KMTVRYGKFL5 LLKDGA 
ENDLTWVLKHCERFliKOOOTS I KSSLLCLflfJNrVTiaw nu ra ot r» 

MIMLGDKEKTFQFLHQFSRLLTSAFLWLPRLHISSYLPNDTVES 
GIHPVYFCSTHYIEWLLKAELPLVFSAFHMSGFAPSQICLQKIT 
QCFWNYLDWIEICHYIATCVFLGPDYQVYICIAVFKHLQQDILQ 
HTQTQDLQVFLKEEALHGFRVSDYFEYMEILEQNYRTVLLRDMR 
NIRLQST 




1122 


456 


AASTKDAVSRKRSHSAS EKSGTGTS IS KRLNMNFQ IRNPMKAMY 
PGTFYFQFKNLWEANDRNETWLC FTVEG I KRRS WS W KTG VFRN 
QVDSETHCHAERCFLSWFCDDILSPNTKYQVTWYTSWSPCPDCA 
GE VAEF LARHSNVNLTI FTARL Y YFQYP C YQEGLRS LS QEGVAV
BIMDySDFKyCWENFVYNDNEPFKPWKGLKl^yRLLKRRLRfiSt ' 
Q 


5623 


3 


954


FLP FF I RA PKi SkNGQWLFTFTTP FP FANKAL PGWEG I VPACFH 1 

RKKILTPSTGTMELLQVTILFLLPSICSSNSTGVLEAANNSLW 

TTTKPSITTPNTESIiQKNVVTPTTGTTPKGTITNEIiLKMSLMST 

ATFLTSKDBGLKATTTDVRKNDS I ISNVTVTS VTLPNAVSTLQS 

SKPKTETQSS IKTTEI PGS VLQPDASPSKTGTLTSI PVTI PENT 

SQSQV I GTEGGXNASTSATSRS YS S 1 1 LPWIAL I VITLS VFVL 

VGLYRMCWKADPGTPENGNDQPQSDKESVKLLTVKTISHESGEH 
SAQGKTKN ^ [ 


5624 
5625 


159
1 


898 


PG VAAAAGALPQ YHGPAPAL VSCRR ELS LS AGS LQLERKRRDFT 1 
SSGSRKLYFDTHALVCLIiEDNGFATQQAEI I VSALVKI LEANED 
IVYKDMVTKMQQEITFQQVMSQIANVKiOMIILEKSEFSALRAE 
NBKI KLELHQIiKQQVMDBVI KVRTDTKLDFNLEKSRVKBL YS LN 
EKKLLELRTEIVALHAQQDRALTQTDRKIETBVAGLKTMLESHK 
LDNIKYLAGS I FTCLTVALG F YRLW I | 






1180 


I'l PSSAAAQRAG P PA1)«ALKALS PGGARAHABRRGEMRATPLAAP 
AGSLSRXiCRLELDDNLDTERPVQKRARSGPQPRLPPCLLPLSPP 
TAPDRATAVATASRLGPYVLLEPEEGGRAYQALHCPTGTBYTCR 
VYPVQEALAVLE P YARLP PHKHVARPTE VLAGTQLLY7AFFTRTH 
GDMHS LVRSRHR 1 PEPE AAVL FRQMATALAHCHQHGL VLRDLKL 
CRFVFADRERKKLVLENLEDSCVLTGPDDSLWDKHACPAYVGPE 
ILSSRASYSGKAADVWSLGVALFTMLAGHYPFQDSEPVLLFGKI 
RRGA Y ALPAGLS APARCLVRCLLRRE PAERLTATG I LLHP WLRQ 
DPMPLAPTRSHLWEAAQWPDGLGLDEAREEEGDREWLYG 


5626 
5627


3123 


2611 


v f KAX/ib VAMENQ VLTPH VYWAQRHRE L YLRVELSD VQNPAISI| 
TENVLHFKAQGHGAKGDNVYEFHLEFLDLVKPEPVYKLTQRQVN 
ITVQKKVSQWWBRLTKQEKRPLFLAPDFDRWLDESDAEMELRAK 
EEERIiNKLRIiESEGSPETLTNLRKGYLFMYNIiVQFLGFSWI FVN 
LTVRFClLGKESFYDTFHTVADMiMYFCQMLAWBTINAAIGVTT 
SPVLPSLIQLLGRNFILFI I FGTMEEMQNKAWFFVFYLWSAIE 
I FRYS FYMLTCIDMDWKVLTWIJIYTLWI PLYPLGCLAEAVS VIQ 
SrpiFNETGRFSFTLPYPVKIKVRFSFFIiQIYLIMIFLGLYlNF 
RHLYKQRRRRYGQKKKKIH




3123 
75 4 


2011 


P PRALGS VAMEKQ VLTPHV YWAQRHREL YLRVELS DVQNPAI SI 
TENVLHFKAQGHGAKGDNVYEFHLEFLDLVKPEPVYKLTQRQVN 
ITVQKKVSQWWERLTKQEKRPLFLAPDFDRWLDESDAEMELRAK 
EEERLNKLRLESEGSPETLTNLRKGYLFMYNLVQFLGFSWIFVN 
LTVRFCILGKBSFYDTFHTVAOMMYFC^MIAVVETINAAXGVTT 
SPVLPSLIQLLGRNFILFIIFGTMEEMQNKAWFFVFYLWSAIE 
I FRYSFYMLTCIDMDWKVLTWLRYTLWI PLYPLGCLAEAVSVIQ 
SIP IFNETGRFSFTLP YP VKI KVRFSFFLQI YLIMI FLGLYINF 
RHLYKQRRRRYGQKKKKIH 


5628 




1455 


VA(3AMASKCLKAGFSSGSLKSPGGASGGS1^VSAMYSSSPCKLpH 
SLS P VARS FSACSVGLGRSS YRATS CLPALCLPAGGFATS YSGG 

GGWFGEGILTONEKETKQSLNDRIAGYLEKVRQLEQENASLESR 

IREWCEQQVPYMCPDYQSYFRTIEELQKKTLCSKAENARLWEI 

DNAKLAADDFRTKYBTEVSLRQLVESDINGLRRILDDLTLCKSD 

LEAQVESLKEELLCLKKNHEEEVNSLRCQLGDRLNVEVDAAPPV 

DliNRVLEEMRCQYBTLVENNRRDAEDWLDTQSEELNQQVVSSSB 
QliQSCQAEIIELRRTVNAT.RTFT.nan'HQivrDnaT ttctt hohotmw 1 

SSQLAQKQCMITNVEAQLAEIRADLER^QEYQVLLDVRARLEC * 

BINTYRGLLESEDSKLPCNPCAPDYSPSKSCLPCLPAASCGPSA 
ARTNCSARPICVPCPGGRF " 4 j 


5629 


2287 


938 



GRPRS S SDNRN F LRERAGLS SAAVQTR I GNSAAS RRS PAAR PFV* 
PAPPALPRGRPGTEGSTSLSAPAVLWAVAVVVVVVSAVAWAMA 
NYIHVPPGSPEVPKLNVTVQDQEEHRCREGALSLLQHLRPHWDP 
aBVTLQLFTDG I TNKL IGCYVGNTMED WLVR I YGNKTE LLVDR 
□EE VKS FRVLQAHGCAPQLYCTFNNGL C YE F IQGEALD PKHVCN 
PAI FRL I ARQLAXIHAI HAHNGWI PKSNLWLKMGKYFSL I PTGF
>UJKUlNKRFLSDIPSSQIIiQkHMTWMi^^]^SNliGSPVVLQiNDL~ j 
LCKNIIYNEKQGDVQPIDYEYSGYNYLAYDIGNHPNEFAGVSDV 
DYSLYPDRELQSQ??LRAYLEAYKEFKGPGTEVTPKT'vrtt mmr 

NQFALASHPFWGLWALIQAKYSTIEFDFLGYAIVRFNQYFKMKP 
BVTALKVPE


5630 


1194 


278 


GFWA I AOTC^UIHLPPGSPWLVPASPWRLPEMSS FGYRT£tw5lp I 

TLICCPGSDEKVFEVHVRPKKLAVEPKGSLEVNCSTTCNQPEVG 

GLETSLDKILLDEQAQWKHYLVSNISHDTVLQCHFTCSGKQESM 
KSNVSVYQPPRQVILTLOPTLVAVGlf <! PTTRr»Dvt5tn/i7T»T norm 

LF LFRGNE TLHYE TFGKAA P APQEATATFNS TADREDGHRNFS C 

LAVLDLMSRGGNI FHKHSA P KMLE I YE P VSDSQMV 1 1 VTWS VL 

_ L3 IiFVTS VLLCF I FGQHLRQQRMGTY G VRAAWRRLPO A PR P | 


5631 
5632


1053 


290


SRVDDFVRPEP S RAEPSRSGRRRPARKAATMSVF<gkL.FGA6GGK 
AGKGGPTPQBA IQRLRDTEEMLS KKQEFLEKKI EQELTAAKKHG 
TKKKRAALQALKRKKRYEKQLAQIDGTLSTIEFQREALENANTN 

TE VLKNMG YAAKAMKAAHDNMDIDKVDELMQDIAI>QQELAEE IS 
TA.ISKPVGFGEEFDEDETiMAPr.PPT.PTiwwT tvtmt t 010^^™^ 1 

LPNVPS IALPSKPAKKKEEEDDDMKEliENWAGSM ------ | 


5633


3 


952 


wwwspprrlwwgsi^aaqrpavpvsgiju^i^etrrphrraH 

SVRVARGRXGVWAOPOPl^PRPVnQRRRMnDT>r , DDDnv'7inrnxTi-.T^ 1 

FTFVSSADAEDLSGSIASPDVKLNLGGDFIKESTATTFLRQRGY 
GWLLEVEDDDPEDNKPLLEELDIDLKDIYYK1RCVLMPMPSLGF 
NRQ WRDNPDFWGPLAWLFFSMISLYGQFRWS WI ITIW1FGS 

LTIPLI^VLGGEVAYGQVLGVIGYSLLPLIVIAPVLLVVGSFE 
WSTLIKLFGVFWAAYSAASLriVQPPPVTJOfDT TTvnrmrvw 1 
FLSXYTGV 


5634 


771 


460 


QGCSKTMSVGRPFYRSSEFMEQl^SSHljHQVPFFCCFlVvCLCNI 
CLFBNS VSKLYMLCFNFFMS I FFYSLS ITKUJLI YLWGLS YQSL 
LLLLLSGHRPWGSSMV 


5635


1446 


855 


PKATGR I RSRAAAS RPRAGAGASGAB PREGRER3RLSGRRAPAM 1 
ARNTLS SRFRRVD I DEFDENKFVDEQBEAAAAAAEPGPDPSEVD 
GLLRQGDMLRAFHAAIjRNS P VNTKNQAVXERAQGWLKVLTNFK 
SSEIEQAVOSLDRNGVDlALMJTYTVTfflPPVD'TBTacenTrT r \ 
ALAVGGLGS I IRVLTARKTV 




3 


943


UHGPKS TATDTGRAk VS FWRFPXtbPGVltKf^NVQl S^EKRRFRTL 1 
RSLFHPFPVTRSGAPRAVLVGSSWPAKMVAPAVKVARGWSGLAL 
GVRRAVLQLPGLTQVRWSRYSPEFKDPLIDKEYYRKPVEELTEE 
EKYVRELKKTQLI K7AAPAGKTSS VFEDP VI SKFTNMMMIGGNKV 
LARSLMIQTLEAVKRKQFEKYHAASAESQATIERNPYTIFHQAL 
KNCEPMIGLVPILKGGRFYQVPVPLPDRRRRFLAMKWMITECRD 
KKHQRTLMPEKLSHKLLEAFHNQGPVIKRKHDLHKMAEANRAIA 
HYRWW


5636
5637 


2253 


1143 


IjSDTICQHP PAEKKxj Y^HR KLRE VPJRivGI PRLPKD VFMDTHQG 
LTDVRAKVTGFSEG WDS VKGGFS S FS Q ATHSAAGA WS KPRE I 
ASLIRNKFGSADNIPNLKDSLEEGQVDlWSKALaVISNFQSSPK 
YGS EEDCS SATSGSVGANSTTGGIAVGASSSKTNTLDMQSSGFD 
ALLHE IQE IRETQARLEES FE TLKEHYQRD YSL I MQTLQEER YR 
CERLEEQLNDLTEIiHQNE I LNLKQELASME EK I AYQS Y ERARDZ 
QE AliEACQTR I S KMELQQQC3QQ WQLEGLENATARNLLGKL IN I 

L1JVVMAVLLVFVSTVANCVVPLMKTRNRTFSTLFLVVFIAFLWK 
HWDALFS YVERFFSS PR 




948 


2532 


MS FCGARANAKMMAAYNGGTS aaaaghhhhhhhhlphi2p~p^^ — 1 
HHHHPQHHLHPGSAAAVHPVOQHTSSAAAAAAAAAAAAAMLNPG 
QQQPYFPS PAPGQAP<3PAAAAPAQVQAAAAATVKAHHHQHSHHP 
QQQLDI EPDR P IGYGAFGVVWSVTD PREX3KRVALKKMPNVFQNL 
VSCKRVFRELKMLCFFKHDNVLSALDILQPPHIDYFEE I YWTE 
LMQSDLHKIIVSPQPLSSDHVKWLYQILRGiKYLHSAGILHRD 
r KPGNLLVNSNCVLKICD FGLAR VEEIJDESRHMTQE VVTQ YYRA 
?EILI4GSRHYSNAIDIWSVGCIFAELLGRRILFQAQSPIQQI i DL 
rTDLLGTPSLEAMRTACEGAKAHILRGPHKQPSLPVIiYTLSSQA
THEAVHLLCRMLVFDPYKRISAIO>AIAHPYLDEGRLRYHTCMCK 
CCFSTSTGRVYTSDFBPVTNPKPDDTPEKNIjSSVRQVKE I IHQP 
ILEQQKGNRVPLCINPQSAAFKSPISSTVAQPSEMPPSPLVWE 


5638


125 


1155 


DRKMSELDQIjRQEAEQLKNQIRDARKACADATIiSQITNNIDPVG 
RIQMRTRRTLRGHLAKI YAMHWGTDSRLLVS ASQDGKL 1 1 WDSY 
TTNKVHAIPLRSSWVMTCAYAPSGNYVACGGLDNICSIYNLKTR 
EGNVRVSRBLAGHTG YLS CCRFLDDNQXVTSS GDTTCALWDIET 
GO^TTTFTGHTGDVMSLSLAPDTRLFVSGACDASAKLWDVREGM 
CRQTFTGHESDINAICFF PNGNAPATGSDDATCRLFDLRADQEL 
KTYSHDNI ICG ITSVSFSKSGRLLLAGYDDFNCNVWDALKADRA 
GVLAGHDNRVS CLG VTDDGMAVATGS WDS FLKI WN 


5639 


125


1155


DRKMSELDQLRQEAEQLKNQIRDARKACAJDATLSQITNNIDPVG 
RI QMRTRRTLRGHLAKI YAMHWGTDSRLLVSASQDGKLI IWDSY 
TTNKVHAIPLRS SW VMTCAYAPSGNY VACGGLDN I CS I YNLKTR 
EGNVRVSRELAGHTGYLS CCRFLDDNQ I VTS SGDTTCA1*WDI ET 
GQQTTTPTCHTGDVMSLS1APDTRLFVSGACDASAKLWDVREGM 
CRQTFTGHESDINAICFFPNGNAFATGSDDATCRLFDLRADQEL 
MTYSHDNIICGITSVSFSKSGRLLLAGYDDFNCNVWE3ALKAJDRA 
GVLAGHDNRVSCLGVTDDGMAVATGSWDSFLKIWN 


5640 


200 


1092 


Q(^NKKTMLSHWTMWKQRKQQATAIMK2VHGNDVDGMDLGKKVS 
IPRDIMLEELSHLSNRGARLFKMRQRRSDKYTFENFQYQSRAQI 
NHSIAMQNGKVDGSNLEGGSQQAPLTPPNTPDPRSPPNPDNIAP 
GYSG PLKEI P PEKFNTTAVPKYYQSPWEQAI S NDPELLBALYPK 
bFKPEGKAELPDYRSFNRVATPFGGFEKASRMVKFKVPDFELLL 
LTDPR FMSFVNPLSGRRS FNRTPKGWI S ENI P I VI TTEPTDDTT 
VPESEDL 


5641


27 


332 


CRHNCNGDVKLLSNQMDKLFAFHLFTFHGLLHFLDGSlOKLIQA 
EIILSDHSSILVLENNFLFKVKSKQFIHLIAKKFYISITIVSAS 
NGESFVLSMIVTG 


5642 


199 


1247 


ITPCRMDFLVL Fh F YIiAS VLMGLVL I CVCSKTHS LKGLARGGAQ 
IFSCI IPECI^RAMHGLLHYIjFHTRNHTFIVLHLVLQGMVYTEY 
TWEVFGYCQELELSLHYLLLPYIiLLGVNLFFFTLTCGTNPGIIT 
KAKB LLFIiHVYE FDEVMFP KNVRCSTCDLiRKPARSKHCS VCNWC 
VHR FDHHCVWVNNCIGAWN IR YFL I YVLTIiTASAATVAI VSTTF 
LVHLVVMSDLYQETYIDDLGHLHVrdDTVFLIQYLFLTFPRlVFM 
LGFVWLS FLLGGYLLFVLYLAATNQTTNEWYRGDWAWCQRCPL 
VAWPPSAEPQVHRNIHSHGLRSNLQEI FLPAFPCHERKKQB 


5643 


1 


847 


PSGGVRDVETRGPGSRAARG PRVVMKRRGVGAGA I AKKKLAEAK 
YKERGTVLAEDQLAQMSKQLDM FKTNLEE FASKHKQEIRKNPEF 
RVQ FQDMCATIG VDPLASGKGFWSEMLGVGDFYYELGVQI I E VC 
LALKHRNGGLITLEEliHQQVLKGRGKFAQDVSQDDL I RAI KKLK 
ALGTGFG 1 1 PVGGTYLIQS VPAELNMDHTWLQLAEKNGYVTVS 
EIKASLKWEI^RARQVLEHLLKEGLAWLDLQAPGEAHYWLPALF 
TDLYS QE I TAEEAREALP 


5644 




inn 


PRRMGSV7VQLITSVGVQQNHPGWTVAGQFQEKKRFTEEVIEYFQ 
KKVSPVHLKILLTSDEAWKRFVRVAELPREEADALYEALKNIiTP 
YVAIEDKDMQQKEQQFREWFLKEFPQIRWKIQESIERLRVIANE 
I EKVHRGCVI ANWSGSTGILSVIGVMLAPFTAGLSLS ITAAGV 
GLG IAS ATAG I AS S IVENTYTRSAELTASRLTATSTDQLEALRD 
ILHDITPKVLSFAbDFDEATKMIANDVHTLRRSKATVGRPljIAW 
RYVPINWETLRTRQAPTR TVR in/ARNT /iKATQfiVTAnrr.nmjMT 

VQDSLDIiHXGEK^ESAELIJlQWAQELEENIil^LTHIHOSLKAG 


5645 


537 


799 


VQS VRDIjKR LSPTD P PGDSGNRD VTRED P VTGPLNSAS SQ VPTL 
YLCLQNSLLGHSSVEDARATMELYQISQRIRARRGLPRLAVSD 


5646 


3745 


3328 


AEQYGTSPHIiLFCMLLSSCLPPAjtOTTKAXTPPPLVLSLTTADP 
AGKPAPCRVTLTLLRAS I PATKRAS FLSS FI KMFFEE JjB YILGF 
LSUI^FHVHVSVYSAICHFQKEGTGNSRSFTCTPELFPRLQTHIi 
RAEGGAQ 


5647 


288 


800 


GVIMATSELSCEVSEENCERREAFWAEWKDLTLSTRPBEGCSIiH 
EEDTQRHETYHQQGQCQVLVQRS PWLMMRMGILGRGLQE YQLP Y 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








QRVLPLPIKrPAKMGATKEEREDTPIQLQELLALETALGQQCVD 
RQE VAE I T KQ LP P WP VSK PGALRRSLS RS MS QEAQRG 


5648 


7 


1518 


VLSELCGRHEAIiREVGAEWPPPTCSPKlCSGJbQQAGNTDWSLTM 
APQSLPSSRMAPIX3MLLGLLMAACFTFCLSHQNLKEPALTNPEK 
ooiAnx is KKJL l ELDAE VLE VKHP THE WQALQ PG QA VPAGS 
HVRLNLQTGEREAKLQYEDKFRKNLKGK^LDINTNTYTSQDLKS 
ALAK FBCEGAEMESS KED KARQAE VKRLFR P IEELKKDFDELNW 
IETDMQIMVRLINKFNSSSSSLEEKIAALFDLEYYVHQMDNAQD 
LLS FGGLQ W I NGLN S TE PL VKE YAAFVLG AAFS S N P KVQ VEAI 
BGGALQKLL VI LATEQPLTAKKKVL FALCS LLRHF P YAQRQFL K 
LGGLQ VLRTLVQEKGTE VIiAVR VVTLLYDLVTEKMFAE E EAELT 
QEMSPEKLQQYRQVHLLPGLWEQGWCEITVHLLALPEHDAREKV 
LQTLGVLLTTCKDRYRQDPQLGRTLASLQAEYQVLASLELQEX3E 
DEGYFQELLGSVNSLLKELR 


5649 
5650


1172 


3006 


KI^EQLDAINEEIRMIQEEKESTELRAEEIETRVTSGSMEALNL 
KQLRKRGS I PTSLTDLS LAS AS PPLS GRSTP KLTSRS AAQDLDR 
MGVMTLPSDLRKHRRKLLS PVSREENREDKATI KCETSP PSSPR 
TLRLEKIiGHPALSQEEGKSAIjED0X5SNPSSSNSS0DSLHKGAKR 
r^x»u>s1GRIjFGKKEKGkLIQLSRDGATGHVLLTDSEFSMQEPM 

vpaklgtqae kdrrlkkkhqlledarrkgm p faqwdgptws wl 

ELWVGMP AWYVAACRANVKSGAI M SAL S DTE 1 QRE I G I SNALHR 
LKLRLAIQEMVSLTSPSAPPTSRTSSGNVWVTHEEMETLETSTK 
TDSEEGSWAQTLAYGDMNHEWIGNEWLPSLGLPQYRSYFME CLV 
DARMLDHLTKKDLRVHLKMVDS FHRTSLQYG I MCLKRLNYDRKE 

lekrreesqheikdvlvwtndqwhwvqsiglrdyagnlhesgv 
h3allaldenfdhntlali lq i ptqntqarqvm ere fnnllalg 
tdrklddgddkvfrrapswrkrfrprehhgrggmlsasabtlpa 

GFRVSTLGTIiQP PPAPPKKIMPEAHSH YLYGHMLS AFRD 




1172 


3006 


MLQFQLDA I NEE^ I RM I QEEKESTELRAEBIETR VTSGSME ALNL 
KQLRKRGS I PTS LTDLS LASAS PPLSGRS TPKLTS RS AAQDLDR 
MGVMTLPSDLRKHRRKLLSPVSREENREDKATIKCETSPPSSPR 
TLRLEKLGHPALSQEEOKSALBDQGSNPSSSNSS0DSLHKGAKR 
Wil K3b x GRLFGKKEKGRLI QLS RDGATGHVLLTDS EFSMQEPM 
V?AKLGTQAEKDRRLKKKHQLLEDARRKGWPFAQWDGPTArVSML 
ELWVGM PAWYVAACRANVKSGAI MSALSDTEI QRE I GI SNALHR 
LKLRLAIQEMVSLTSPSAPPTSRTSSGNVWVTHEEMETLETSTK 
TDSEEGS WAQTLA YGDMNHE WIGNEWLP S LGLPQYRS YFME CLV 
DARMLDHLTKKDLRVHLKMVDS FHRTS LQ YGlMCLKRliNYDRKE 
LE XRREESQHE I KDVLVWTNDQ WHWVQS I GLRD YAGNLHESG V 
HGALLALDENFDHNTLALI LQI P TQNTQARQ VMEREFNNLLALG 
TDRKLDDGDDKVFRRAPS WR KRFR PREHHGRGGMLSASAETLPA 
GFRVSTLGTLQPPPAPPKKIMPEAHSHYLYGHMLSAFRD 


5651


646 


1869 


AROGQRQ P Wn » PAH A If rib a! <6 tr c dp Vr 1 c7^"er't.Tp/>tV»v r> n A mr>^ nM T 
™^vvw*syr no aMru-iiUjlr'*Holloi'KV *cajo(jWEGPASP*TPGSTL 

AWGEGAGI R* ASGLTAAGAASAAAA/PPPTRGG PAPAGCGRAPP 
WPAPLR VPTHGRAPAPRS RAAPRAPALS HOTAAAALS PAS PAGP 
ADP* LPGHS SOS P PRO * RWQR53 P Q A D A D atiD vu d» d» c e n o » m 

qtpgwpgscclaqgwqaeplgapgaedgNpvppqrgfplgtlgs 

PAGS WAGLAG YG * AGAPGTQATAPRAAGQT P VAAAPNCRV*GSA 
PALHRAPAAADPGSPLQAP PRAWAS paaag pgls ssdycgglga 
gwragispellgaaglsdnwarcpgpgpab*ggqpgcrtipasa 

CMPSPPVEGSLG1iSRKG1IGDLPSQ^*GWHECRRARHLVPLPRL 

lgprgrtgrpssps 


5652 


735 


" 343 


hhkkyqhihqksfscpepacgksfnfkkhlkehmklhsdtrdyi 
cefca^sfrtssnlvihrrihtgekplqceicgftcrqkaslnw 
hqrkhaetvaalrfpcefcgkrfekpdsvaahrskshpallla 


5653 


66 


1401 


rgriiqsrgrltlglvlllldilgarohgqrvshgwkggfltapl 
ctpqpc^pgtrrgrrrslkeatepqi^amaeefvtlkdvgmdftl 
gdweqlgleqgdtfwdtaldncqdlflldpprpnltshpdgsed 
lbplaggs peats pd vtetknsplmedffeegfsqei /srd viq 
swllblqfrrs lyrghlvr ♦ farrs rks sev * ychqrgkshgmq 
ES * I K3SRTQ3 CVHR FHGRRFHG \ DNVS K KTLTPAKlS'RB YRGEF F 

S YSDHSQQDS VQEGEKP YQCSE CGKS FSGS YRLTQHW I THTRE K 

PTVHQECEQGPDRKASHSGYPKTHTGYKPYVCMEYGTPFSQSTY 

LWHQKTHAGEKPCKSQDSDHPPSHDTQSGEHQKTHTDSKSYNCN 

ECGKAFTRIFHLTRHQKIHTRKRYECSKCQATFNLRKHLIQHQK 
THAANV 


5654 
5655


3 


598 


TLPLFPGRK^RGWRRCGAVAARKNSTGGNVSINQRRDSVRMSAT " 
NWKPFVYGGLASITABCGTFPIDLTKTRFQIQGQTNDAKFKEII 
YRGMLHALVRIGREEGLKALYSG * VGLHAFLCHCSLFRMG IDFR 
PRLHRSQVKSLRCV*KEQIA**/MFSLLISTLISKYIYYAADVL 
EKLFYYIQVQTDNNKKICLFKNI 




2 


867 


RPPGIRAPRQLHPAAGRRPDASARPRFRPTVLLHDPFQLSFPPP 
PLSYPSVFPAVARVLPQRSGDYRAAGMPQLSGGGGGGGGDPELC 
ATDEMI PFKDEGDPQ\REKIFAE I VNPEEBGDLADI KSSLVNES 
EIIPASNGHBVARQAQTSQEPYHDKAREHPDDGKHPDGGLYNKG 
PSYSSYSGYIMMPNMNNDPYMSNGSLSPPIPRTSNKVPWQPSH 
AVHPLTPLI TYS DEHFS PGSHPSHI PSDVNS KQGMS RH P PAPDI 
PTFYPLSPGGGGQ ITPPLGWQGQP 


5656 


228 


1066 


PRRVPPLPEFASaPGAAFFHSGRLQRSLTKDSAGCFSQCRSRAM" 

LVIJISGIjTKAI^RTIiAPQVCSSFATCPROYDGTFYEFRTYYLK 

PSNMNAFMENLKKNIHLRTSYSELVGFWSVEFGGRTNKVFHIWK 

YDNFPHRAEVRKALANCKEWQEQSIIPNIiARIDKQETEITYLIP 

WSKLQKPPKEGVYEtiAVFQMKPGGPALWGDAFERAINAHVNLGY 

TKWG VFHTB YGELNRVHVLWWNES ADS RAAVRHXSKEDP I S WG 

GVHBSVNYLWSQQNM 


5657 


105


1052 


GQRLQS PRVQMP VQ PPS KDTEEMEAEGDSAAEMNGE EEES EEER 
SGSQTESEEESSEMDDBDYERRRSECVSEMLDLBKQFSELKEXL 
FRERLSQLRLRLEEVGAERAPEYTEPLGGLQRSLKIRIQVAGIY 
KGFCLDVIRNKYECEIiQGAKQHLESEKLLLYDTLQGELQBRIQR 
LEEDRQSLDLSSEWWDDKLHARGSSRSWDSLPPSKRXKAPLVSG 
P Y I VYMLQEID I LEDWTAI XKARAAVS PQKRKS D\DLDPAVHS Q 
GDPQS S WHCTQDSRLPPADRRTHRPLR VCPARLLWCCWAL PLH L. 
ALVWTPPL 


5658


2346 


3541 


TERRVYNPWPEPDPD\CIQEDPWNLPNS I KTLVDNI QRYVEDGK 
NQLLLALLKCTDTEXjQLRRDA I FCQAI»VAAVCTFSEQLLAAIiG Y 
RYNNNGEYEESSRDASRKWLEQVAATGVLLHCQSLLSPATVKEE 
RTMLED I WVTLS ELDNVTFS FKQLDEN YVANTNVFYH IEGSRQA 
LKVIFYLDSYHFSKLPSRLEGGASLRLHTALFTKVLENVEGLPS 
PGSQAAE DLQQD INAQSLE KVQQYYRKLRAFYLERSNLPTDAS T 
TAVKIDQL I RP INALDELCRLMKSFVHP KPGAAGS VGAGL I P I S 
SELCYRLGACQMVMCGTGMQRSTIiSVSLEQAAILARSHGLLPKC 

IMQATDIMRKjQGPRVEILAKNLR\naDQMPQGAPRLYRLCQPKMN 
GDL 


5659 


2 


696 


WKRSGE VS P KGELGAWRGNS GR P KI IGRAAEAENEDRTLGRH* P 
GNBRSQPRSPLRLIAPQLKAEAAADKGIjAPVPPPFSSGHSGPC\ 
EREGEGQRGRGRSRRGAHLELKPSPGLRAGAPTDRGRGGPAEVA 
AAGGRRMVQKESQATLEERBSELSSNPAASAGASLEPPAAPAPG 
EDNPAGAGG\AAVAGAAGGARRFIiCGWEGFYGRPWVMEQRKEL 
FRRLQKWELNTYIi 


5660 


229 


853 


PVTMWAFS ELPMPLL INL IVSLLGFVATVTLIPAFRGHFI AARL — 
UbUUUNKi iKUQlPESQvjVISGAVFLI ILFCFIPFPFLNCFVKE 
QRKAFPHHE FVALI GALLAI CCMI FLGFADDVLNLRWRHKLLLP 
TAASLPLLMVYFTNFGNTTI VVPKPFRP ILGLHLDLGR* S YHCC 
P YGT YFRE PFLVIiHI LLQVFLFCI*CVFPDP FW 


5661


2 


473


LNLYPSPCGGIPKLPGIiPREAAAALGASFLAEAPLPVTVRGSGL 
AGMAVTCDP:<AFLSICFVTLVFLQLPLASICQ2J*GTDSCASRGK 
ADFDVTGPHAPIIiAMAGGHVELQCQLFPNISAEDMELRWYRCQP 
S LAVHMHE RGMDMDGEQKWQ YRGRT 


5662 


2 


i3ia ■ 


LRKEGRCRRGSNRGVWAAPAEGLGGRGMLGVRCLLRSVRFCSSA 
P FPKHKPS AKLS VRDAXiGAQNASGERX XI QGW I RSVR5QKE VLF 
LHVNDGSS LES IjQ WADS GLDSRELTFGS S VE VQGQLI KS PS KR 
QNVELKAE KI KVIGNCDAKDFPI KYKERHPLE YLRQYPHFRCRT 
NVLGSILRIRSEATAAIHSFFKDSGFVHIHTPIITSNDSEGAGE 
LFQLE P SGKLKVPEENPFNVPAPLTVSGQLHLE VMSGAFTQVFT 
FGPTFRAENSQSRRHLAE FYMIEAE IS F VDSLQDLMQV IEELFK 
ATTMriVLSKCPEDVELCHKFIAPGQKDRL*HMLKNNFLIISYTE 
AVE I LKQAS QNFTFTPEWGADLRTEHE KYLVKHCGNI P VFVINY 
PLTLKPFYMRDNEDGPQELEGSVA*HSriGLMILLSIWIGQP 


5663 


119 


698 


PADIGRSTAKTPGPPRSLEMDDPRYGMCPLKGASG'CPGAERSLt 
VQSYFEKGPLTFRDVAIEFSLEEWQCLDSAQQGLYRKVMLENYR 
NLVFLG IALTKPDLITCLEQGKBPWNTIKRHEMYAKPPVI CSIIFP 
QDLWAEQDIKDSFQEAILKKYGKYGHANFQLQKGCKSVDECICVH 
KEHDNKLNQCLI PKKKK 


5664 


113 


572 


SLSMESNHKSGDGLSGTQKEAALRALVQRTGYSLVQENGQRKYG
GPPPGWDAAPPERGCElFIGKtiPRDLFEDELIPLCEKIGKIYEM 
RMMMDFNGNNRGYAFVTFSNKVEAKNAI KQLNNYEIRNGRLLGV 
CASVDNCRLFVGGIPKTKK 




347 


702 


WQHLIILLHCERTSPAMITSELPVLQDSTNETTAHSDAGSELE

ETEVKGXRKRGRPGRPPSTNKKPRKSPGBK5RIEAGIRGAGRGR 

ANGHPQQNGEGEPVTLFEWKLGKSAMQRC 


5666 


213 


540 


VSCLPTS CKM I TLNNQDQPVP FNS SHPDE YK I AALVFYS CIFli " 
GLFVNITALWVFSCTTKKRTTVTIYMMNVALVDLIFIMTLPFRM 
FYYAKDEWPFGEYFCQI LGA 


5667


1 


695 


HPLPSASLGLPSVSLGVSLCVRSALLEAWPMLPKRRRARVGSP 
SGDAASSTPPSTRFPGVAIYLVEPRMGRSRRAFIiTGLARSKGFR 
VLDACSSEATHWMEETSAEEAVSWQERRMAAAPPGCTPPALIjD 

i s wltes iigagq p vpve crhrlevag ps kg p hs pawmpayacqr 
ptplthhntglsealeilaeaagfbgsegrlltfcraasvlkal 

PSPVTTLSQLQ 


5668 


691 


894 


CSFtFClPDLF^tLGRKEEEAVLVGGEWSPSLDGLtiPQADPO 
VLVRTAI RCAQAQTGI DLSGCTKW 


5669 


407 


1 


DSGAP EGLSFLMSTQEGLS MHAHPQAYTPF I YLHARKRRGEIGD ~ 
ADSRFNDR YAH KSAQL Y FL YFVCW I FQD VY Y FTI KEKNHFFFPK 
ARGAPTKYSGS P IGS PTTTPPTRP PS FNLHPAPHLLASMQLQKL 
NSQ 


5670 


3 


373 


SSECLTMAWIPLLLPLL I LCTVS VAS YEIAQPSS V£ VSP^QTAK 
ITCSGDVLJUCKYARWFQQKPGQAPVLVIYKDTERPSGIPERFSG 
STSGTTVTLTISGAQVEDEADYFCYSATDNFLWVF 


5671 


280 


524 


KFPPKKTP PHLGMESAITLWQFLLQLLLDQKHEHIi I CWTSNDGE "~ 
FIGjLKAKKVAI^WGLRKNKTNMJIYDKIjSRALRIaLFMT 


5672 


2 


557 


FVPATPDPGVWLPPSRDPAMAKRSSIjYIRIVEGKNLPAKDITGS"" 

SDPYCIVKVDNEPIIRTATVWKTl^iCPFWGEEYQVHLPPTFHAVA 

FYVMDEDALSRDDVIGKVCLTRDTIASHPKGKFSLPSHTGLPSP 

WPPSHSETSPLGSVWSPAQGKPFLLSPEAGATFCTPGLCSAACS 

QAWLLLPLP 


" 5673 


327 


696 


ITVADQISHWSAGRIKNRTRIPECIHSSAATTLAGPHTMEGESV

KLSSQTIjIQAG©DEKNQRTITVNPAHMGKAFKVMI?ELRSKQLLC 

DVMIVAEDVEIEAHRWLAACSPYFCAMFTGDMS 


5674


17 


984 


GGGSME^GESTSAVL^GFVtX^AIAFQHLNTDSDTEGFLIOBVKGE 
AKNSITDSQMDDVEWYTIDIQKYIPCYQLFSFYNSSGEVNEQA 
LKKILSNVKKNWG^KFRRHSDQI>rrFRERLLHKNLQEHFSNQ 
DLVFLLLTP S I ITES CSTHRLEHSLYKPQKGLFHRVPLWANLG 
MSEQLGYKTVSGSCMSTGFSRAVQTHSSKFFEEDGSLXEVHKIK 
EMYASLQEELKS I CKKVEDS E QAVDKL VKDVNR L KR E I E KRRG A 
QIQAAREKNIQKDPQENI FLCQALRT FFPNS EFLHS CVMSLKID 
MFLKVAVTTTTISM 


5675


80 


753 


EGSRRG PTRLARLS ARAGRLHFP PGFS SRLIHFRGV3 ECRRP PG 
KS G VP VS APGS DG KWWE ERPGMFS LMAS CCGWFKRWREP VRKVT 
LLMVGLDNAGKTATAKG IQGE YPEDVAPTVG FS KlfnjRCGKFE V 
TI FDLGGGIRIRGI WKNYYABSYGVIFWDSSDEERMEETKEAM 
S EM LRH PR I SGKP I LVLANKQD KEGALGEAbviECLSLE KLVNE 
HKCZi 


5676


2 


930 


F VS S PPPRP VQPARPGG FGLSGRRS LLCQVASTPAHVGVMRS P V 
RDLARNDGEESTDRTPIiLPGAPRAEAAPVCCSARYNIAIIjAFFG 
FFI VY ALR VNLS VALVDMVDSNTTLEDNRTS XACPEHS API KVH 
HNQTGKKYQWDAETQGt^ILGSFFYGYIITQIPGGYVASKIGGKM 
LLG FG I LGTAVLTLFTP I AADLGVG PL I VLRALEGLG E G VTPPA 
MHAMWSS WAP PLERS KLLS I S YAGAQLGTVI SLPLSGI I CY YMN 

WTYVFYFFGTIGIFWFLLWIWLVSDTPQKHKRISHYEKEYILSS 
L 


5677 


1 


1028 


PPRDGFLELRRLSVPLCSGPCPLTSLSRQGERSGGHLVAAARAA 
VTAETHPLPLLAPLAVCQSVKSPAACQVRPRPRAVALPAALGGP 
GRSL PGLTAATMSS FSES ALEKKLSE LSNSQQS VQTLS L WL IHH 
RKHAGPIVSVWHRELRKAKSNRKLTFLYLANDVIQNSKPJCGPEF 
TREFESVLVDAFSHVAREADEGCKKPLERLLNIWQERSVYGGEF 
IQQLKLSMEDSKSPPPKATEEKKSLKRTFQQIQEEBDDDYPGSY • 
£ PQDPS AG PLLTEELI KALQDLENAASGDATVRQKI AS LPQEVQ 
DVSLLEKITDK3AAERLSKTVDEACLRNRGPGTS j 


5678 


3 


593 


SSSPPSSTPSLPLPFY LLLGQLRLQLLWGTAHLSGAG EAAPCPG 
GSGRTAAPRTRADPAAQSLMIMNKMKNFKRRFSLSVPRTETIEE 
SLAEFTEQFWQIjHNRRNENIjQLGPLGRDPPQECSTFS ptdsgeb 

pgqls pgvqfqrrqnqrrfsmbvrasgalprqvagcthkgvhrr 
aaalqpdfdvskrlslpmdi ! 


5679 


2 


423 


lnsrvddfvavpgaimdedyygsaaewgdeadggqqeddsgege ■'" 
ddaevqqeclhkfstrdyimepsifntlkryfqaggspenviql 
lsenytavaqtvnliaewliqtgvepvqvqetvenhlkslli kh 
fdprkads 1 fteegetpawleqmi ahttwrdlfyiclaeahpdcl 

MLNFTVKVGRVLELRRKVTMNVYFWLIjVCFL 


5680 


258 


592 


RRLTSTSEKLQNRNSHTPLESLIHPQPSYKGFGIMFtlkKKKKIE 
ISGPSNFEHRVBTGFDPQEQKFTGLPQQWHSLI.ADTANRPKPMV 
DPS C I TPIQLAPMKrrVRGNKPC 


5681


45 


869 


LLCAKTLGVRTKESQABG YNRSG INNHQAEDPRFCPSFCWMRSA " 

RQTRPQRLRKEAARPPTPGSCPGGTGMDGKKCSVWMFLPLVFTL 

FTSAGLWIVYFIAVEDDKILPLNSAERKPGVKHAPYISIAGDDP 

PASCVFSQVMNMAAFLALVVAVIAFIQIjK^ 

LCLAS FGMTLLGNFQLTNDEE IHNVGTSLT FGFGTLTC W IQAAIi 

TLKVNIKNEGRRVGIPRVI LSASITLCVGPLLHPHGPKHPHVCS 

QGPVGPGHVL 




39


622 


PSRS CLGTMRKWRHREVNLPEVTQQDAVCPAPI PS PGLSAQTGL" ' 
QKIWGTIHCQVCTOAPAWPGSPWHEE^LLIjLVPLIjLLPGSYGL 
P F YNG FY YSN S ANDQNLGNGHGKDLLNG VKL WET PE ETL FT YQ 
GASVILPCRYRYEPALVSPRRVRVKWWKLSENGAPEKDVLVAIG 
LRHRS FGDYQGRVHLRQD 


5683 


89 


778 


GSCGATALITRCliAWSVLISRLAMATYTCITCRVAFRDADMQRA 
HYKTD WHR YNLRRKVASMAP VTAEG FQERVRAQRAVAEEES KGS 
ATYCTVCS KKFAS FNAYENHLKSRRH VELEKKAVQAVNRKVEI4M 
NEKNLE KG LGVDS VDKDAMNAAI QQ A I KAQP S MS P KKAP P APAK 

EARNWAVGTGGRGTHDRDPSEKPPRLQWFEQQAKK1AKHSEDD 
SEDEEHDLC 


5684 


195 


677 


T^WC^FRGYl/jPfe V lMl<ALb^PP YLTV6'l'b ^SAXYRGAFCEAK IKT " 
AKRL VKVKVT FRHDSSTVE VQDDHI KG P L KVGAI VEVKNLDGAY 

QSAVINKLTDASWYTWFDDGDEKTLRRSSLCLKGERHFAESET 
LDQIiPLTNPEHFGTPVIGKKTNRGRRYE 


5685 


779 


1262 


IiIiQQPVVHCFLLFPPFRFSHHMIPGPPGPHTTGIPHpAlV1 r P"Q - 
VKQEHPHTDSDI^HVKPQHEQPJCEQEPKRPHIKKPLNAFMLYMK 
EMRANWAECTL^SAAINQILGRRWHALSREEQAKYYELARKE 
RQLHMQLYPGWS ARDNYVS PS S I PVALHS 


5686 


128 


1181 


CTWWQVNITLLDINDNHPTWKDAPYYINLVErrrPPDSDVTTVVA 
VDPDLGENGTLVYS IQPPNKFYSLNSTTGKIRTTHAMLDRENPD 
PHEAELMRKIWSVTDCGRPPLKATSSATVFVNLLDLNDNUPTF | 
QN L P FVAfi V1»EG I PAG VS I YQWAI DtiDEGLNGLVS YRMP VGMP 
RMDFLINSSSGVWTTTELDRBRIAEYQLRWASDAGTPTKSST 
S TLT IHVLDVNDETPT FFPAVYNVS VSEDVPR\GSGWSG * AARN 
NDVGLNAELSYFITGGITVDGKPSVGYRDAWRTWGLDRETTAA 
YML1 LEAIDNGPVGKRHTGTAT VFVTVLDVNDKRPI ILQSSYV 


5687 


17 


917 


AAPPAPPDG> PPP/PPPAPPT/PGPAA/APASSCQPRLSAGRAA " " 

QGDGGAAAVGHVIiWPAVGPVR VNPGLQTP VPRPELLPG P \S SS 

LHSDSSYPPDAGLSDDEEPPDASLPPDPPPLTVP/ADA/PMPVT 

SGCRMPSTSASB/AAGGQGACTHAKGSETPPPASPQTSEPAPSP 

liPPHLTGGPGMYSSEAKLPNSFSCLGLAGTGAGl *GTASAHGTG 

PPVLPHVCTPSLANPQP\AVGPEASSLPLGVSGIGMSA/SAPIS 

^PFVAIGSCWLRGIPPPGSGFLCPGRAPGPVPITTHGQEGQGP 


5688 


1 


420 


LTKWDLFGINUYRLIJCTGIEHGAMPEQVGVYWYS/CLYDSRKLFF 
♦SHMI IRSLL*KVIDDSLGQLPLLRELLL* *LNVIDRCI ILAYV 

LRVEKTFAITYLKNFTVKVDFSLLGEIPLISMAAILKLWIMKID 
DGYIPAVF 


5689 


Isol 


3 


HELSGKHISMVSGNTCNWHPGGHSPGGGGQGBITSKDRGBIPAL " 

IWA/RK?IGTWTATKPTHRAG*GGAEEYQPPPQPCEGPRSTSRG 

GEG*GHAVGPGREIGKEGSLPFLGPKALGF*SASCQRAFEGGAH 

GSTARKPAPATPGTRHPRTMETRBVAQGWPAGPRSQFWDQHPHS 

PGEHRPSG\SPLPACPPRAWPKAGAVASATGTG\PQLPGSRGKO 

KLPRTREPPLLQAGWAVRKPPWSEAKEGLGOAGRPSGMDSSAS\ 

PQTPGGRGSLEWGLPLYLGPHHDVK*RSDRLG* PP * GGQGGGGH 

GAPSTPGPGGEAW*tiPQQTSRPKFGPQAY*GE\GSPGLQCPCSK 

EL*RVPPGSLGPSTQCKYEPrDKHS\CGAiDAQLEVSTAGSRSTF 

GQELKGPLDAGRiWPGAPSASSSHR*GG*ERARAGAGHRGST*A 

SSKIEQGRPRPGPTSDALADVEGGAES/GPHPWPLPGTLPNR/P 

GS PP PA* AS AG RKGTVSTLGGGLL 


5690 


1424 


58 


PSPPAGVCAAFAPLPUAlJu^RRPCSP 

gawrts vsalrrgatg/apcs pgaeaapwqtggpaidg\dgelp 
*vrseeaprgcgaegggpgsgpvrrpgagrgahagqgrqqdpep 
dglrhrqhgaasharhrliqrijrpghhqnrhvrrdpqappggpap 
ghaaalpertrgvae ppawahagsdawragr* sqrt * erarprh 

PTFQGRAGS \GQPGYQ PPNPHPGPSSPPAAP\GPRGA*GNPQLE 
KAPRS DRN P SQGLRTRIRR PETPDCGP PSPAGSS ASAST FRCTS 
SLSLLG P/ PGAHNLDTAPQDR * HGP* GDKRGAPG VAGEDPR PP * 
GNFVR * LLLNP/ GVA* RHGTS PFLGPSLGENGGQWDS GNLFGTP 
KG * SHPAFTKST * SMEAEKS YWNHPHR \DRGRQG VR INCLRVGE 

SEMWGPYSAPRPGTVFLSSFLSPASEEH\PEGSSSFNTPFPPAG 
PEGDPGLNS PGLLP 


5691 


107 


550 


ISNDPSPGYNIEQMAKRGKKLVELPYTVKGMDVSFSG^LSFikD" 
VAHRMLATGE CTPEDLCFSLQVMQ * KTGTES WG*RFY I VBQN* S 

GDAPLIFSPYLSLTGNCGFAMLVEITERAMAH\CGSPGGPSLWG 
GVGVYVLLESVPLSYS 


5692


1193 


548


tqawtraekdrkgsvrai,rlhlergppt*rgshplVqsvpciqk 

PSIFSSYPI/GLPQSGGEPGPVGEQQPVRRPEQPSCGPASRMPL 
TSRSVPPGRGALPPDSLSTRKGIiPRPSTAGHRVRESGHKVPVSQ 
RLNLPVMGATRSNLQPPRKVAVPGPTR*RDQDSKQDFSSKPLQS 
VPGLASTQQT LTPADS G PGTGGRDATRAGLPGVBTMGNG VD 


5693 


1258 


liid 


Alii WfVKAU I lWWAQPHGCSNLVbKAKliUIiSSRPSQNTEPOAP 
*QAGPPSSLRPP\SRRR*APEWPKRArGSRCRGLSAPPWPWPAA 
RGB/PGSAPSHAP/PNS?RPSGTRHP/PGPSSRVI*YSPSLPRNS 
PEAIVWRSSRFPLWFPLRCCFWVSGFKDPNPVLRFF 


5694 


3 


1338 


GSKEPARSLHRRGSGHKSSAGKWGSVTLSTAGALG*KQLHQ*WT
QRCL\NNLSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSEKR
SLAESGLSWFSESEEKAPKKLEYDSGSLKMEPGTSKWRRERPES
CDDSSKGGELKKPISLGHPGSLKKGKTPPVAVTSPITHTAQSAL
KVAGKPEGKATDKGKLAVKNTGLQRSSSDAGRDRLSDAKKPPSG
IARPSTSGSFGYKKPPPATGTATVMQTGGSATLSKIQKSSGIPV
KPVNGRKTSLDVSNSAEPGFLAPGARSNIQYRSLPRPAKSSSMS
VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT
PAPVNQTDREKEKAKAKAVALDSDNISLKSIGSPESTPKNQASH
PTATKLAELPPTPLRATAKSFVKPPSLANLDKVNSNSLDLPSSS
DTTQCI 


5695 


3


1338 


GSKEPARSLHRRGSGHKSSAGKWGSVTLSTAGALG*KQLHQ*WT 
QRCL\NNLSSEEFHASSSLNSLPSTPTASRRNSTIVLRTDSEKR 
SLAESGLSWFSESEEKAPKKLEYDSGSLKMEPGTSKWRRERPES 
CDDSSKGGELKKPISLGHPGSLKKGKTPPVAVTSPITHTAQSAL
KVAGKPEGKATDKGKIAVKNTGLQRSSSDAGRDRLSDAKKPPSG
IARPSTSGSFGYKKPPPATGTATVMQTGGSATLSKIQKSSGIPV
KPVNGRKTSLDVSNSAEPGFLAPGARSNIQYRSLPRPAKSSSMS
VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT
PAPVNQTDREKEKAKAKAVALDSDNISLKSIGSPESTPKNQASH
PTATKLAELPPTPLRATAKSFVKPPSLANLDKVNSNSLDLPSSS
DTTQCI 


5696 


3 


1338 


GSKEPARSLHRRGSGHKSSAGKWGSVTLSTAGALG*KQLHQ*WT
QRCL\NNLSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSEKR
SLAESGLSWFSESEEKAPKKLEYDSGSLKMEPGTSKWRRERPES
CDDSSKGGELKKPISLGHPGSLKKGKTPPVAVTSPITHTAQSAL
KVAGKPEGKATDKGKLAVKNTGLQRSSSDAGRDRLSDAKKPPSG
IARPSTSGSFGYKKPPPATGTATVMQTGGSATLSKIQKSSGIPV
KPVWGRKTSLDVSNSAEPGFLAPGARSNIQYRSLPRPAKSSSMS
VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT
PAPVNQTDREKEKAKAKAVALDSDNISLKSIGSPESTPKNQASH
PTATKIAELPPTPLRATAKSFVKPPSLANLDKVNSNSLDLPSSS
DTTQCI 


5697 


1147 


47 


PSEALS PPACPSAPAPRRS I ISRLFGTSPATEAAPPPPEP VPAA 
QG PATVQS VBDFVPDDRLDRS FLEDTTPARDE KKVGAKAAQQDS 
DSDGEALGGNPMVAGFQDDVDLEDQPRGSPPLPAGPVPSQDITL 
SSEEEAEVAAP7KGPAPAPQQCSEPETKWSSIPASKPRRGTAPT 
RTAAP PWPGG VS VRTGP EKRSSTRP PAEMEPGKGEQASSS ES DP 
EGP I AAQMLS FVMDDPD FESEGSDTQRRADDFPVPJDDPSDVTDE 
DEGPAE PPPP p KLPLPAFRLKNDSDLFGLGLEEAGPKESS EEGK 
EGKTPSKENKKKKKKGKEEEEKAAKKKSKHKKSKDKEEGKEERR 
RRQQRPPRSRERTAA 


5698


2 


666 


gaeaaepqeulpplsqssrffqeqqkmnkslgpvsfkdvaVdft - 

QEE WQQLD PEQ KI T YRDVMLENYSNLVS VGYH I IKPDVISKLEQ 
GEEPWIVEGEFLLQSYPDEVWQTDDLIERIQEEENKPSRQTVFI 

etli*r/ergnvpgntfdvetnpvpsrkiaythslcnscer\gp 
nassbyissdgryarmkadecsgcgksllhiklbkthpgdqave 

FNQ 


5699 


2 


1448 



rvrqppglwvrrtvpamqcpaglsrvpgvag/dpslpsfrgprd 
eaahrgtiqtarhtrklyvqgpasgpplprvstqvai *dekpla 
rps/grtnapfpqgqkpagkaapgpaaagrvamr\pghpgllas 
dsqrssskgsgwetpvpws*aqpgwvsgllllgdpsgpgsl*rs 
twlvggargpegsgvrgsgwpsgcsdig walagwnhs *hldpnt 

WTQKWTGE/SPAPGEEG\VAPAPRGPTAEHGHCELTTESQYSNN 
VPI LFQNPSGALRSRRTEPAG WVP PTRHE * DDG * TAAP AS GGA P 
VS TPT WAGTP/ LNASLGPTDPQGK PGCRP PCALPKPAG PE RSA* 

ggslgcr/ smlpassgpppapgprrlaagautsasarcppaaaa 
g wq prrpg fagraal pgpphp ps s * relgglpgpg w * tldplpa 

HPAHPPGSAPPWGALGGWAAARASLPWSPSLCLSFPAVTPVAGL 
FPPGRG 


5700 




597 


NGHKGVWEINIY*RRSNIHXNSKSfiSHLNQDHSFPPPTPNSARS 
KLHSTGTAKNTGLPLSGAPRQRAVFSGRTICQEFSSCLQCAYLD 
E* CSIASSLI KAILRVS VLSB 


5701 


59 


410 


IFEKICSDTQEFISPEINPQICSWLYFDKGAK/'NHATGKDSLFN 
KWSWKNWLSTCR*MRPGPYFTPYTKINSK*IK/DANIRCETVKL 
LEENTGENLHDTGLGNVFLDMTPKTQPTKQK 


5702 


3


1517



ETFVDPSQCGGIPSDSPHPVITPSRASESSASSDGPHPVITPSR

ASESSASSDGPHPVITPSRASESSASSDGLHPVITPSRASESSA

SSDGPHPVITPSRASESSASSDGPHPVITPSRASESSASSDGLH 

PVITPSRASESSASSDGPHPVITPSWSPGSDVTLLAEALVTVTN 

I E VINCS ITE I ETTTSS I PGASDTDLI PTEGVKASSTSDPPALP 

DSTEAKPHITEVTASAETLSTAGTTESAAPHATVGTPLPTNSAT 

EREVTAPGATTLSGALVTVSRNPLEETSALSVETPSYVKV5GAA 

PVSIEAGSAVGKTTSPAGSSASSYSPSEAAIjKNFTPSETLTMDI 

TTKGPFPTSRDPLPSVPPTTTNSSRGTNSTLAKITTSAKTTMKP 

PTATPTTARTRPTT\A*VQVKMEVSSSCG*VWLPRKTSLTPEWQ 

KG * CSSSTGNS TPTRLTSRS P YCVSGEANG / PSAAARHVP YAKR 

GCCP*PGPPPTDCSCVTVLRGTQKVPMKGSMSKPLTPDVATGPS 

LTSTGVYVWGGASPVPRGVLGLTLAHVLCFSKEKT 


5703 


14 


1117 


hhkdsrsqglprtqegarpelrpllcpralwpvtrlsyrcpwqa 
pkagigtkakpseshlklhpgwpsldrqgepatlgtgtghcsds 
rilrwhp*htaar*prwrrlpsshrwtrhlgvlrvqdks**vsl 
dpscrprflrtc**ygmrsvasssnpppgwsgpgasvfparpvs 

ALPTGPRCW*APRGRTRQPCX3WPRLSSPHATADWGPGCPLSPSR 
GSWETAPGS * WCPWL*AARWTGWRTASGASAGIX5RAADRPSAWA 

rrvagll pgqgltvrr * h* tagapas vrss qgatrspapggdq c 
acgrgpgsc*hpppwpvspsspvpcpsgr*hlrgpllsaarpra 

AGWPRHSPHDTQTPEP 


5704 


23 


562


GDYEFDSPYWDDISQAAKDLVTRLMEVEQDQRITAEEAISHEWI
SGNAASDKNIKDGVCAQIEKNFARAKWKKAVRVTTLMKRLRAPR
QSSTAAAQSASATDTATPGAAGGATAAAASGATSAPEGDAARAA
KSDNVAPRRP*LPPQPQMEVPPQPLMAVSPQPPMEASLQPLMGE
SPQP


5705 


23 


562


GDYEFDSPYWDDISQAAKDLVTRLMEVEQDQRITAEEAISHEWI
SGNAASDKNIKDGVCAQIEKNFARAKWKKAVRVTTLMKRLRAPE
QSSTAAAQSASATDTATPGAAGGATAAAASGATSAPEGDAARAA
KSDNVAPRRP*LPPQPQMEVPPQPLMAVSPQPPMEASLQPLMGE
SPQP


5706 


1161 


610 


qlgrfxaqdtvairkVkevfgtgaMrhWi:i,fthked*ggqald 

DYVANTDNCS LKDLVRECERRYCAFNNWGS VE EQRQQQAELLAV 
IERIiGREREGSFHSNDLFLDAQLLQRTGAGACQEDYRQYQAKVE 
WQ VEKHKQELRENESNWAYKALLRVKHLMLLHYE I FVFLLLCS I 
IiFFIIFLF 


5707 
"S708 


28 


609 


GSPAPTPGFRRRPGRGTFSPGTRHHQGRAEPEPDAPERAPLRR* 
MFAIQPGLAEGGQFLGDPPPGLCQPELQPDSNSNFMASAKDANE 
NWHGM PGRVBPI LRRSS SES PSDNQAFQAPGS PEEGVRS PPEGA 
EI PGAE PEKMGGAGTVCS PI*BDNG YAS S SLS I DSRS SSPEPACG 
TPRGPGPPDPLLPSVAQA 




44 


1925 


SFSWEETISPCFPKMPAEPWWLSPVSLGAAGWP^PRPYLDLPA 
QAS VSRPHDRA* GEAVS LS LSSGD VCGHTDGGGAGSDPQAKPKP 
PRCPFTAMPS PRTKQKVRNKVCLL IAIR YSDI PS DVS KAP \GPA 
GNPHDRSSTAA*LHRRAGAGSLCLSASLLPPSFSLGAPGAPSPL 
RVS PASGG PRKEGRQGSGG + AGGGGP \ ARTHADL PCVG F VCS PP 
LLK* SDS P VKQI»PA\SGQGS GAGM PP VGS SDILR PRPTS VSGTG 
RAAG*CSWQPAACCTPRSQ*WAVARSPSRCSRW*RQSGR*RG*S 
S RRRRGP * AAGRS TPAVP * PCS *GGAGRRAYACRTG WGYAPSR* 
u&trouir i ovJa/u>*TrtASnbTGA* *SRLCGTAGTGPLCSQSSRS * 
AG * R CCCTAAS P CGGSG P S HPGS PS AHCLSWSGGRTQ PRAPS AH 
GRGRAMGSRCVCTCTGLPCPGIPLSGASPGGSGBTGAGRSHTLK 
AARSRLSPRPGSGSRGSY*SHNDNWGTWPAPPSAGHLLVGG*NS 
QRTSSDH*YTGTRRPWAGPGTRCSTAPSRAAPPVSRCRPPPPPP 

PPRPPRLPAAAS/SGGASGSPAASCSCSCRAPAKPASS/GEAPA 
PPPRPEPPPPPARRP 


5709


2 


2031 


ITLCPLPQTEKCJLNVVTEAATPLGlYLKARVEAGGLKEIjEISWG 
LHQIWRWGAWMRAGMGGCRCWGVMAPFAPR/NALS FLVNDCS 
LIHIWVCMAAVFVDRAGEWKLGGLDYMYSAQGNGGGPPRKGIPE 
LEQ YDPPELADS SGRWRE KRS ADMWRLGCL I WE VFNGP L PRAA 
ALRNPG Kl P KTL VPHYCELVGAN P K V RPN P ARFLQNCRAPGGFM 
SNRFVETNLFLEEIQIKEPAEKQKFFQELSKSLDAFPBDFCRHK 
VLPQLLTAPEFGNAGAWLTPLFKVGKFLSAEEYQQKI I PVWK 
MF S STD RAMR I RLLQQMEQFIQYLDEPTVNTQ I FPHWHGFLDT 
NPAIREQTVKSMLLLAPKLNEANLNVELMKHFARLQAKDEQGPI 
RCNTTVCLGKIGSYIiSAS TRHRVLTS AFSRATR0PFAPS RVAG V 
LGFAATHNLYSMNDCAQKILPVLCGLTVDPEKSVRDQAFKAIRS. 
FLS KLES VS ED PTQLEE VE KDVHAASS PGMGGAAAS WAGWAVTG 
VSSIiTSKLIRSHPTTAPTETWIPQRPTPEGVPAPAPTPVPATPT 
TSGHWETQEEDKDTAEDSSTADRWDDEDWGSLEQEAESVLAQQD 
DWS TGGQVSRASQVS \TPTTNPPNPQS PTGAAGK\RGLIiGTGLA 
GAKLPGATS * R YTAGQRV 


5710


1 


562 


IPGSTISCEVELMARMAKTIDSFTQNQTRLWI IDGIiDACEQDK 
VLQMLDTVRVLFSKGPFIAI FASDPHI I IKAINQNLN5VPSGFK 
\LNGHD YMRNI VHLPVFLNSRGL/RQ/LQENFS * LQQQMBTFHA 
QILQGYRKKLTEEFHRTALGR*QNLVARQPSIDG*DAIGPELYV 
CIAI QFNTNKDDAT 


5711 


1526 


1130 


RRHPFQWTTVTQEAFSHHDVAFTSTPVLFYPDSAQPFIVKSESS 
SQ I AKAVLSQQRPS LFHECAFHF FS * SLQRHT INLDQG I F* LLM 
LSEBRQHLFESS / 1 WTTPHNLK* / FE IHEHLGSHEGHWTLFFLL 
QIL 


5712 


3 


1391 


GRKLFQSLD I S ERLKFLLTLDCVDDTL I VLAEEHGCLDI t KELP" 
ETVIDI.T.NKCLTFHPSKRPTPDELMKDKVFSEVSPIiYTPFTKPA 
SLFSSSLRCADLTLPEDISQLCKDINNDYLAERSIEEVYYLWCL 
AGGDLEKELVNKE I IRSKPPICTLPNFLFEDGESFGQGRDRSS / 
TFR * YHWD I WMPAKK* I ERCWGRS I LP I TLKMTS L I LPYSNSN 
NELS AAATLPL 1 1 REKDTE YQLNR 1 1 LFDRLLKAYP YKKNQ I WK 
EARVD I PPLMRGLTWAALLGVEGAI HAKYDAIDKDTP I PTDRQ I 
E VDI PRCHQYDELLSSPEGHAXFRRVLKAWWSHPDLVYWQGLD 
SLCAPPLYLNFNNEALVYACMSAFI P KYLYNFFLKDNSHVI QEY 
LTVFSQMIAFHDPELSNHLNE IGFI PDLYAI PWFLTMFTHVFPL 
HKIFHLW\DTLLLGEFLFPILYWE 


5713 


634


284 


PVCAVPVDRWPVLPREDQEGQQL*AKLPRDFRR*FQILGPMEGH 
TACRCS RRG AQVQHLPRE D I RAAE * DPHLREVWPGLPT5S ATS P 
* RAVLTSPCSHLGSADAASSHWLCGVSFH 


5714 


212 


613 


WGLGLGPTMSSLGGGSQDAGGSSSSSTNGSGGSGSSGPKAGAAD 
KS AVVAAAAPAS VADDTP P P ERRNKS G I ISEPLNKSLRRSRPLS 

HYSSFGSSGGSGGGSMMGGESADKATAAAAAASLLANGHDLAAA 
MA 


5715 


131 


1979 


ESASQQKRSKCLILTLKLELSGSAPKKTSARPGSSLWLPPHSQE 
QTPPAS KLQGGGGGLQTGWGLHP VPVTAAS PLPRWCLFGAVAK\ 
GLPGP * LCPSGAA/GGLQRGPGLS PLGAAGKVSCLHPPSMVENN 
DSTCHEHHEGILAARVTPVP\SGKPGRVLKPPGRVCRPPHPAAS 
PRPPGS/ SDLDGPRPQMHLRAFPAAHGGPVNTPHGGEBKTFMSS 
QIRRKETKP L*RKTPAG\NNYQSNSI PVSQSPQLTVDLLPSAGR 
TQAPSGRGDAGKPTPGHG\LPKAS VILTPMCPCSLAGGQ* PPGL 
YPKTPKQRRWRRPL/LLGPSQ*GSRQSTC*EV\GALGEPVR1PG 
L* PDLSC I LSNOS KHRREGLS FPRSLG PGRRG PAGLQSLGCSPT 
PKNTACHSSGHVALQAGHDSARDVGSGHVALQAGHDSTQDVGRP 
VWRWI PLE * LGLSRETGQATRRGLVWIS PGRAAAACVACAQALE 

egplrlpgqdrgaqpcshcpgraagqpepgagapcre/gg * DPT 
glt/gvpgtdpkrggrkpgqsgqetqgptvwsgpesplqpkp*e 
rqe/vgagassgvglsrgraggpssawbvaamllllrhgshsel 
tdlteaqtsqh 


5716


1711 


1370 


rvfsllcegpghcyqgavcreacaaaspgldsaaephrlcehtd 
+lpk*gpgyiqhfhcdsnilcilykisfnlfsysf*gvaryac* 
rcplvl * sg ffti i vgg ysccmplkt 


5717 


44 


1489 


lpteai^eseWVseygkcgprSlVpege;Stsplp^vdted6ld 
egpgalvlesdlllgqdlefeeeeeeeegdgnsdqlmgferdse 
gdslgarpg^pxgUddssgggralsabskveepargpgsarge 

RPGPACQLCGGPTGEGPCOGAGGPGGGPLLPPRLLYSCRLCTFV 
SHYSSHLKRHMO/THSGEKPPliCXSRCPYASAQLVNLTRHTRTHTG 
EKPVRCPHCPFACSSLGNLRRHQRTHAGPPTPPCPTCGFRCCTP 
RPARPPSPTEQEGAVPRRPEDALIjLPDLSmvPPGGASFLPDCG 
Q\CGVKGRASAGLDQNHCQS /SLFPWTCRGOGQELEEGEGSRLG 
AAMCGRCMRGEAGGGASGGPQGPSDKGFACSLCPFATHYPNHLA 
RHMKTHSGEKPFRCARCPYASAHLDNLKRHQRVHTGBKPYKCPL 
CPYACXSNLANLKRKGRIHSGDXPFRCSLCWYSCNQSMNLIRHM 


5718 


120 


284 


VAHALSLPAES YGNDVS MTHPQL PPTQLAWDLCRTCLPLS YNFT 
S**STADPLHL 


5719 


48 


428 


ELNNGPFQm^i^ggNLAVTGSWADRSPLHIWAS^ 

L S QG YNVNAVTLDH VT PLHE ACLGDHVACAR7LL EAGANVNAI T 

I DG VT PLFNACSQGS P S CAELLL EYG AKAQP \ ES CLPS P 


5720 
5721


1 


1051 


LQAFRNASEVPMVLVGTQDAISAA\NPRVYRRTSRARKLSrDLK " ' 

\RCT\YYE\TCGGTYGLQMMSVSFQDVAQKWAL\RKKQQ\LAI 

GPCK\SLPM\SPSH\SAVSAASIPARAPINQGHE/SGGGSAFSD 

Y\SSSVPSTPSISQRELRIETIAASSTPTPIRKQSKRRSNIFTS 

RKGADP\DRE KKAAGCKVDS I GSGRAI PI KQG I LLKRSGKSLNK 

EWKKKYVTIiCDNGLLTYHPSLHDYMQNIHGKEIDI*LRTTVKVPG 

KRL PRATPATAPGTSPRANGLS VERSNTQLGGGTGAPHSAS SAS 

LHSERPLSSSAWAGPRPEGLHQRSCSVSSADQWSEATTSLPPGM 
OKPASG 




97 


492 


RHS S P C CSLRRTERSSNAAVST / TT VQQ FKRFI ENYRRH IGCVA 
VF YAIAGGLFLERAYYYAFAAHHTG I TDTTRVG 1 I LSRGTAAS I 
SFMFS Y ILLTMCRNLITFLRETFbNRYVPFDAAVDFHRLI ASTA 


5722 


88


1043 


VALDVLAGSSPGGGMAGALLGPRVHGIRAVLRVARGGVQAPGAP
GSLGVSHAAAPPARPQGAAQSPHRGRRHGGGGAGLPPPRSPRFP
QESVPASTSTARGPRRVSRRLPPQHPGPRGRRRRPGAGVGAPRR
GRARGQAGLLGRQGQGGRGAERERAALQARRGRRPGPEPDQSCG
GRPRRAAAAPGRAPADPQPPAPRPAPAPDVRPPADAPAPAPAPA
PPPPPHLGALTAGSGEERQSQPRAETLRLGRGAPLP\PRAERGG
RPKQAEQQQ\PKRPTPPARGPQSSGDPAMLPQRAGLRTGGIAGT 
KSSTREIPEMI 


5723


88 


1043 


VALDVLAGSSPGGGMAGALLGPRVHGIRAVLRVARGGVQAPGAP
GSLGVSHAAAPPARPQGAAQSPHRGRRHGGGGAGLPPPRSPRFP
QESVPASTSTARGPRRVSRRLPPQKPGPRGRRRRPGAGVGAPRR
GRARGQAGLLGRQGQGGRGAERERAALQARRGRRPGPEPDQSCG
GRPRRAAAAPGRAPADPQPPAPRPAPAPDVRPPADAPAPAPAPA
PPPPPHLGALTAGSGEERQSQPRAETLRLGRGAPLP\PRAERGG

RPKQAEQQQ\PKRPTPPARGPQSSGDPAMLPQRAGLRTGGLAGT 
KSSTREIPEMI 


5724 
5725


3 


1841 


FTNEAPPAPJjPDASASPIjSPHRRAKSLDRRSTEPSVTPDLLNFK 
KGWLTKQYEIX^WKKHWFAIJu^QSLRYY3U)SVAEEAADLDGEID 

lsacydvteypvqrnygfqihtkegeftlsamtsgirrnwiqti 
mkhvhpttapdvtsslpbeknksscsfetcprptekqeaelgep 
dpeqkrsrare\rrregrsictfdwaefrpiqqalaqervggvgp 
adth\dpwrpeaehgelererarrreeriikrfsmldatdgpgte 

DAALRWEVDRSPGLPMSDI.KTHNVHVEIEQRWHQVETTPLREBK 

qvpiapvhlssedggdrlstheltsllbkeleqsqkeasdlleq 

NRLIXJDQLRVALGREQSAREGYVLQATCERGFAAMEETHQICKIE 

dlqrqhqreleklreekdrllaeetaatisaieamknahreeme 
releksqrs'qissvnsdvealrrqyleblqsvqrelevlseqys 
qkclenahij^aleaerqalrqcqrenqel^^elnnrlaae 
itrlrtlltgdgggeatgsplaqgkdayelevpsgarpcltqlc 
tqepqgsaawplsyrwggtdlrqqesqgpgrskspeggeeq 




3


1049 


vnghseetsq& pnrtephdstics vdlg 1 s ts dls pqksgpvg ~" 
swkshsitnmeigglkiydilsdnXdlsshlqplk/ftsavtg 
knivrskaatllydqplqvftgsssssdiiisgtkaifkfdsnhn 
pe/qakynkrphkwahnlhlkymvlhsiisntvav\rsqrhfva 
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corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
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residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
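
The legend above can be read as a small encoding scheme: twenty one-letter residue codes, X for an unknown residue, and three annotation marks (* for a stop codon, / for a possible nucleotide deletion, \ for a possible nucleotide insertion). The following is a minimal sketch, not part of the original disclosure, of how one table entry might be tallied against that legend. The function names `summarize_segment` and `nucleotide_span` are hypothetical, and treating the two nucleotide locations as 1-based inclusive coordinates is an assumption made only for illustration.

```python
import re
from collections import Counter

# One-letter residue codes exactly as listed in the table legend.
RESIDUES = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}

# Annotation marks from the legend.
MARKS = {"*": "stop", "/": "deletion", "\\": "insertion"}


def summarize_segment(segment: str) -> dict:
    """Tally residues and legend marks in one table entry (hypothetical helper)."""
    # Table cells wrap across lines, so whitespace carries no meaning here.
    compact = re.sub(r"\s+", "", segment)
    marks = Counter()
    residues = 0
    unrecognized = []
    for ch in compact:
        if ch in RESIDUES:
            residues += 1
        elif ch in MARKS:
            marks[MARKS[ch]] += 1
        else:
            # Anything outside the legend (lowercase, digits, punctuation)
            # is reported rather than silently dropped.
            unrecognized.append(ch)
    return {
        "residues": residues,
        "stop": marks["stop"],
        "deletion": marks["deletion"],
        "insertion": marks["insertion"],
        "unrecognized": "".join(unrecognized),
    }


def nucleotide_span(begin: int, end: int) -> int:
    """Nucleotides covered by two coordinates, ASSUMING 1-based inclusive
    positions; the order of begin and end is ignored."""
    return abs(end - begin) + 1


if __name__ == "__main__":
    # Invented example string, not an entry from the table.
    print(summarize_segment("MKT*A/C\\G x"))
    print(nucleotide_span(48, 428))
```

Characters that fall outside the legend are collected in the `unrecognized` field, which makes the helper usable as a rough check for transcription damage in an entry. The span helper takes an absolute difference because some rows list a beginning location larger than the end location.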








LQTKSPNRPCQFSSSAPS/VDQRAQ/INQSYAKHSANMNPSNHN 
NVRANTAYHLHQRLGPARHGEMWAI S PNDRL I PAVTRS TI Q RQS 
SVSSTASVNLQDPGSTRRAQIPEGDYLSYREFHSAGRTPPMMPG 
SQRPLSARTYS I DGPNASRPQSARPS INBIPERTMSVSDFNYSR 
TSP 


5726 


2 


486 


j SRSLSMWWNSGLPASSHSSKLPVTVGFSGCVKRLfeLHQRPLGAP 
TRMAGVTPCILGPLEAGLFFPGSGGVITL/BS VGAG1 PGPSRAG 

; QGS PGGSGEGPPLSS PSQPLPADLPGATLPDVGLELEVRPLAVT 
GL I FHLG QAR TP PYLQLQVTEKQVLLRADDG 


5727 


21 


221 


RPILIIjKETRRIiPWATGYABVINAGKSTHNEDQASCEVIjTVKKK 
AGAVTSTPNRNSS KRRSSLPNGE 


5728 


2 


877 


GTRNGQFEPRRGRAWEGSAGGLRAPGAAAGGPGVQPRGSG/LPG 
NA I RAG VN PGRGP ASPFWDLSLPWDLWPPPTDHAPGAP DF PAVE 
GR\PWAGGRPPWPVSGVLGSRVCGPLYSTSPAGPG/SGGLSPSQ 
GGPAGAGGDAG / LPGRC PS AP WRAGS R P AAS CPDWI PGPQGLWL 
HRNPTS/GPPSQIGEGAEQGDEGVADAPQIQCKN/GAEDPPAED 
EPPQVPEAGEEDAVPAEEGPGGTPETQADQVRERPEAHLAEGGA 
KGSPRRLADPQDIjPAGQMSLAPPFPPVAAVIRSNK 


5729 


1 


1525 


AGGAREVLTI^liGHFAGFVGAHKWNQQDAALGbATDSkfiPPGEi, 
CPDVI>YRTGRTLHGQETYTPRLILMDLKGSIjSSLKEEGGLYRDK 
QLDAAIAWQGKLTTHKEELYPKNPYLQDFLSAEGVLSSDGVWRV 
KSIPNGKGSSPLPTATTPKPLIPTEASIRVWSPFLRVHLHPRSI 
CMIQKYNHDGEAGRLEAFGQGESVLKEPKYQEELEDRLHFYVEE 
CDYLQGFQILCDIaHDGFSGVGAKAABLLQDEYSGRGIITWGlil,P 
GPYHRGEAQRNIYRLLNTAFGLVHLTAHSSLVCPLSLGGSLGIiR 
PE PPVS FPYLHYDATLP FHCS AILATALDTVTCS \ Y RLCSS P VS 
NVHL\ ADMLS FCGKKWTAGAI I P F PLAPGQSL PDSLMQFGGAT 
P WTPLSACGE PSGTRCFAQS WLRGI DRACHTSQLTPGTPP PS A 
LHACTTGEEIIAQYUQQQQPGVMSSSHLLLTPCRVAPPYPHLFS 
SCSPPGMVLDGSPKGAAVESVPVFG 


5730 


1258 


1713 


KKFQ APARETC VE CQKTVY PME RLLANQQVFH I S CFRCS YCNNK 
LSLGT YASLHGRI YCKPHFNQLFKS KGNYDEGFGHRPHKDLWAT 
KI ETEG F WERPRN FENCGRPLKS PGGEDCPSC * GGCPGSNY * AQ 
GSSSREKG3QASWNPKLRVA 


5731 


122 


443 


RSHRGELIPKDSCYMRKPPRRPkKRRQG/CALPQGCLTFKDVAI 
EFS LEE WKCLNPAQRALYRAVMLEN YRNLESVGLTS KDSW YMR K 
KPGRGRGKQRRQEWFFLRVY 


5732 


226 


772 


PPSRSCQSPRRKSRRRAHVTVTLVCGFTSFSFSLPLYLCGCliRF 
PERTCS QXiQQADWAPDFGPS S FVPS WGATATGARKFLI AFNI \N 
LLGTKE QAHR I ALNLREQ GRG KDQ PGRLXKVQG I GWYLDEKNLA 

QVSTNLLDFEVTALHTVYEETCREAQELSLPWGSQLVGLVPLK 
ALLDAA 


5733


1 


460 


palqevna>ialawgkqyendartlfbftsgvndtesp 1 1 yrdes 
mrtacs pdglcsdgnglieiikcpftsrdfmkfriiggfeai ksaym 
aqvqysmwvtrknanyfanydprmkreglhyvvierdekymVas 

FDEI\VP\EFIGKMDEVLSRDPM 


5734 


3 


968 


RCNS PESLTSLLVliLTTANNLFVLIPAyS KWrAYTJTffTvFTV*! 
GSLFLMNLLTAI IYSQFRGYLMKSLQTSLFRRRLGTRAAFEVLS 
SMVGEGGAFPQAVGVKPONLLQVLQKVQLDSSHKQAMMEKVRSY 
GSVLLSAEEFQKIjFNELDRSWKEHPPRPEYQSPFLQSAQFLFG 
HYYFDYLGNLIALANLVS ICVFLVLDADVLPAERDDFIIiGILNC 
VFIVYYIiLEMLLKVFAIjGLRGYLS YPSNVFIX3LLTVVLLVLE I S 
TL\VCTDCHTQAGGRRWW/RLLSLWDMTRMLNMIjIVFRF1*RIIP 
SMKPMAWAS TVLGL 


5735 


2 


$40 


FFTPCVARAFNF PDQATVKKAA YSLPRVGGGTS CGLPQARR I SL " 
ATPRQLYK/SSNMTQRWQRREISNFEYLMFLNT1AGRTYNDLNQ 
YPWPWVLTN^SEELDLTLPGNFRIILSKPIGALNPICRAVFYAE 
RYETWBDDQ3PPYH YKTHYSTATSTLS WLVR1 VS I FIELACLW Y 
LKILT 


5736 


1 


382 


GTRPSTKKSGYSP(XjVAVIJiCKGHQKENTAVAHSNQKADSAAQV' 
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SEQ 
ID 
NO: 


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








TARLS VTP PN LL PT VS FPQ PD3j PbNP V YSTTTEKLASDLiRANKN \ 
\ QES * * I LPDSGI PIP* T* TS YLQS TTHLRRAKLPOLLRR 1 


5737 


290 


1041 


1 KACLHLLSSFLTSNFLFNPLLPDSLYaVEARSQRANLGPCRRKR 
LQTLMKLAAG FQYSSH KDPSLSAXEKKTD YHNEARGP WPGWVG * 
RTADGSCGRGPDGAHHPGPKSSSWRASRLLPGLGGSHHLDAYVG 
RDLECGT P APLQLB I PPQPRGHPAPIPTGQAG PRDS G PGAS P * V 
ETRPLTDGRR*PGVRPVGWTPAHPAGTLRPRGAVEPSVSACGKW 
APS PTSQGCCEGRCDAVPKHRAWRTPLCSQ 


5738 

C'J'JQ 


8 


4£o 


DTLS LNCTLy ETLPMTP S F* LS FL * FPGLARAKS I P TKT YSNE V 
VTLWYR PPD I LLGS TDYSTQ I DM W * GQVE VWQGPCG KGGGLVTT 

ATQPAAFLFTVPSLPRGVGCIFYEMATGRPLFPGSTVEEQLHFI 
FR IliS EE AWALCAVBTHR 


O / Jl) 


1 


1222


SFQRRGIRWNVHTLHPHPRAVWAGlGRGHGS*AlXGRARAPALC 
FP TLLE FLES LEPDLPALRAMGLHLWAAGPGTHPAGI SDLLAE V 
S AEVDGPVPGYLSS PQS I TDTCL Y I FTSGTTG LPKAAR I S HLKI 
LQCQGFYQLCGVHQEDVI YLALPLYHMSGSLLGIVGCMG IGATV 
VLKSKFSAGQFWEDOQQHRVTVFQYIGELCRYLVNQPPSKAERG 
HK VRLAVG SGLRP DTWERF VRR FG P LQ VL ETYGLTEGNVATINY 

TGQRGAVGRASWLYKHIFPFSLIRYDVTTGEPIRDPQGHCMATS 
P G E PG LL V AP VS QQ S PFLG YAGG PE LAQG KLLKD V FR PGDVFFN 
TRDLLVCDDQGFLRFHDRTGDP FRWKGENVATTEVAE V FE ALDF 
LQEVNVYGVTV 


5740 
5741 


265 


231 


PAYWLKVPTLCLESKTDLRBKASHVSAQLQGEVRGLAGALWM*A 
YVYERVYN*NISRMVHALEQKRHPAGLSSSMALQLNPCLGMI*IA 
1 LQS E LHKLYDEETQSWVS GS ACGG YP 


5742 


1 


650 


PRKTMRRGVLMTLLQQSAMTLPLWIGKPGDRPPPL03AIPASGDn 
YVARPGDKVAARVKAVDGDEQWIIAEVVSYSHATNKYEVDDIDE 
EGKERHTLS RRRVI PLPQWKANPBTDPEALFQKEQLVIALYPQT 
TCFYRALIHAPPQRPQDDYSVLFEDTS YADGYS PPLNVAQRYW 
ACKEPKICK*CRLADSPSPNDTGQDSRGRAGIKHIPPLKKK 




2 


362 


TQS VKE I LKRNPNVNLTDKDGNTALM I ASKEGHTE I VQDLLDAG 
TYVNI PDRSGDTVLIGAVRGGHVE I VRALLQKYADIDIRGQDNK 
TALYWAVEKGNATMVRDI LQCNPDTE I CTKDG | 


5743 
5744 


2 


415


GKTPEG IDA I EE IE I DLBETKREI S P QfeNGLteE VKPLGEMQTDL | 
KATGRE I SPREKTPEVIDATEEIDKDLEETGRRB I S PEENGPEE 

VKPVDEMETDLKTTGREGSSRBKTREVIDAAEVIETDLEETERE 
ISPQE 1 




3 


703 

5$9 -4 


TRRTTTTSPTTTRQMTTTPAALPTTWTTPDLTTGTPLQMrTIA' H 
VFTTANTCLSLTPS TLPEEATGLLTPE PS KEGPILTAESETVLP 
SDSKSSAESTSADTVLLTSKESKVWDLPSTSHVSMWKTSDSVSS 
PQPGAS DTAVPEQNKTTKTGQMDG I PMS MKNEMP IS QLLM I IAP 
SLGFVLFALFVAFLLRGKLMET YCSQKHTRLDY I GDS KNVLNDV 
QHGREDEDGLFTL 


5745 


1400 




GKSRFVNLMKHS KKTYDS FQDELED Y I KVQKARGLBPKTCFRKM 
KGD YLETCG Y KG EVNS RPT YRMFDQRLPSETIQTYPRSCNI PQT 
VENRLPQWLPAHDSRLRLDSLSYCQFTRDCFSEKPVPLNFNQQE 
YICGSHGVEHRVYKHFSSDNSTSTHQASHKQIHQKRKRHPBEGR | 
EKSEEERS KHKRKKS CEE I DLDKHKS I QRKKTEVEI ETVHVS TE 
KLKNRKE KKSRD WS KKEERKRTKKKKEQGQERTEEEMLWDQS I 
LGF 1 


5746 


3 


821 


S FASGRLTPS5 PAFDGELDIiORYSMdp A vq &M Q t /iMnanou OT ^„ ~| 

RAGERRFPCPVCGKRFRFNSILALHLRTHQPERPRSPAARLLLE 
LEERALLREARLGRARSSGGMQATPATEGLARPQAPSSSAFRCP 
YCKGKFRTSAERERHLHILHRPWK03LCSFGSSQEEELLHHSLT 
AHGAPERPLAATSAAPPPQPQPQPPPQPEPRSVPQPEPEPQPER 
EATPTPAPAAPEEPPAPPEFRCQVCX3QS FTQSWFLKGHMRKHKA 
SFDHACPV 


5747 


2 


1328



0RHVETL C IHFIX3 PSTG STAKTGGRNWLKTGNCL YGNTCRFVHG~H 

PSPRGKGYSSNYRRSPBRPTGDLRERIKNKRQDVDTEPQKRNTE 

SSSSPVRKESSRGRHREKEDIKITKERTPESEEENVEWETNRDD 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








SDNGDINYDYVHELSLEMKRQKIQRELMKLBQBNMEKREfe^IIK ' 

KEVSPEWRSKLSPSPSLRKSSKSPKRKSSPKSSSASKKDRKTS 

AVSSPLLDQQRNSKTNQSKKKGPRTPSPPPPIPBDIALGKKYXE 

KYKVKDRIEEKTRDGKDRGRDFERQREKRDKPRSTS PAGQHHS P 

ISSRHKSSSSQSGSSIQRHSPSPRRKRTPSPSYQRTLTPPLRRS 

ASPYPSHSLSSPQRKQSPPRHRSPMREKGRHDHERTSQSHDRRH 

ERREDTRGKRDREKDSREEREYEQDQSSSRDHRDDREPRDGRDR 

RE 


5748 


934 


473 


s eg pq vfykglaptli a i fpyaglqpScV6slkhlykwai paeg 

KKNENLC^JLLCGSGAGVISKTLTYPLDLFKKRLQVGGFEHARAA 
FGQVRR YKGLMD CAKQVLQKEGALG FF KGL S PS LLKAAL3 TGFM 
FFSYEFFCNVFHCMNRTASQR 


5749 


552 


1 


GFPVDPRVRGSTLSLAERPKGMIRSGSFRDPTDDVHGSVLSLAS " 
SA5ST YSS AEERMQSEQ I RKLRRBLES S QE KVATLTSQL S ANAN 
LVAAFEQSLVNMTSRLRHLAETAEEKDTELLDLRETIDFLKKKN 
SEAQAVIQGAtiNASETTPKELRI KRQNSSDS IS SLNS1TSHSS I 
GSSKDADA 


5750 


22 


866 


I FI S I CLWKAHLCFIJtiLPKDCIDQVMKLQNLFVDDS GR YLAIQF 
IIIiEWAYVFLYYYEYRKAKDQLDIAiCDrSQLQIDLTGALGKRTRF 
QENYVAQLHiDVRREGDVLSNCEFTPAPTPQBHLTKNLELNDDT 
ILNDIKIjADC^QFQMPDLCAEEIAII1X5ICTNFQKNNPVHTLTE 
VELLAFTSCI*LSQPKFWAIQTSALILRTKLEKGSTRRVERAMRQ 
TOALADQPEDKTTSVLERLKIFYCCQVPPHWAIQRQLASLLFEL 
GCTSSALQIFEKLEMWE 


5751 


3 


751 


SCGS ALRAWRCGAAALATFPAPAIipGLMYRALYAFRS AE PNALA 
FAAGETFLVLERSSAHWWLAARARSGETGYVPPAYLRRLQGIjEQ 
DVLQAI DRAI EAVHNTAMRDGGKYSLEQRGVLQKL I HHRKETLS 
RRGPSASSVAVMTSSTSDHHLDAAAARQPNGVCRAGFERQHSLP 
SSEHIiGADGGLFQIPLPSSQIPPQPRRAAPTTPPPPVKRRDREA 
IiMASGSGGHNTMPSGGNSVSSGSSVSSCI 


5752 


3 

> 


471 


gpvcgvgLsVawagpwrgpvhsVggggraalhgaelpcx^gaat 

VEREMELRHKNEMLRVETEARARAKAERENADI IREQIRLKASE 
HRQTVLESIRTAGTLFGEGFRAFVTDRDKVTATVNIFIKQGWQV 
AERQHVGASWS PRSCPCRLCTAL 


5753 


34 


483 


DDSXAI PGGVQAP FGAVRNI YTPRTGHRIRKLDQ IQSGGN YVAG 
GQEAFKKLNYLDIGEIKKRPMEVVNTEVKPVIHSRINVSARFRK 
PLQEPCTIFLIANGDLINPASRLLIPRKTLNQWDHVLQMVTEKI 
TLRSGAVHRLYTLEGRLV 


5754 


14 


331 


TLVHVVEFAGEHAEAIASREQEVLQGWKELLSACEDARLHVSST 
ADALRFHSQ VRDIiZjS WMDG I ASQ IGAADKPRCPSS LLGIiPAS P W 
WPTPATPSPLTAPFSME 


5755 


3 


888 


LGDQFYKEAIEHCRSYNSRLCAERSVRLPFLDSQTGVAQNNCYl' 
WMEKRHRGPGLAPGQLYTYPARCWRKKRRLHPPEDPKLRLLE I K 
PBVELPLKKDGFTSESTTLEALLRGEGVEKKVDAREEES IQE IQ 
R VLENDENVEEGNEE EDLE E D I P KRKNRTRGRARGS AGGRRRHD 
AASQEDHDKPYVCDICGKRYKNRPGLSYHYAHTHLASBEGDEAQ 
DQETRSPPNHRNENHRPQKGPD3WIPNNYCDFCLGGSHMNKKS 
GRPEELVS CADCGRSAHLGGEGRKE KEAAA 


5756 


3 


621 


SSKIiOALFAHPLYNVPEEPPLIiGAEDSLLASQEALRVYRRKVAR" 
VWRRHKMYREQMNLTSLDPPLQLRLEASWVQFHIiGINRHGLYSR 
SS P WSKLLQDMRHF PTIS AD YS QDEKALLGACDCTQ I VKPSG V 
HL KLVLRFS D FGKAMFKPMRQQRDEETP VDFFYF IDFQRHNAE I 
AAFHLDRI LDFRRVPPTVGR I VNVTKEI L 


5757 


3 


473


YKDALLLPDNHRQWFENGTLKLTOVQKGMDEGEYtiCSVLIQPQ 
LSISQSVHVAVKVPPLIQPFEFPPASIGQIiLYIPCWSSGDMPI 
RITWRKDGQVIISGSGVTIESKEFMSSLQISSVSLKHNGNYTCI 
ASNAAATVSRERQLIVRVPPRFW 


5758 


1 


474 


PRRGAGAERGEHREGERGAAGMGE FKVHR VRFFNYVPSQ I RCVA 
YNNQSNRLAVSRTDGTVE I YNLS ANYFQEKFFPGHES RAT3ALC 
WAEGQRLFSAGI^IGEIMEYDLQALNIKYAMDAFGGPIWSMAASP 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)




2 


1240 


SGSQLLVGCEDGSVKLFQITPDKIPV " 

GNAAFAGQGV V ^HU'FHMSDLP^T1 > NGTVHVVVNNQIG*FITDPR" 
MARS S P YPTDVAR WNAP I FHVNADDPBAVI YVCS VAAE WRNTP 
NKDVGADLVCYRRRGHNEMDE PMFTQPLMYKQ I HRQVP VIiKKYA 
DKL I AEGT VTLQE FEEE I AKYDRI CBEAYGRS KDKKI LH I KHWL 
DSPWPGPFNVDGEPKSMTCPATGIPEDMLTHIGSVASSVPLEDF 
KIHTGI^RILRGRADOTKNRTVDWALAEYMAFGSLLKEGIHVRL 
KGQDVERGTFSHRKHVLHDQEVDRRTCVPMNHLWPE>QAPYTVCN 
S S LS E YGVLGFELG YAMAS PNALVLWEAQFGDFHNTAQCI IDQF 

ISTGQAKWVRHNGIVLLLPHGMEGMGPEHSSARPERFLQMSNDD 
SDAYPAFTKDF3VSQL 


5760


1 


1221 


VRDITSDSLSLSWTVPEGQFDKFLVQFKNGDGQPKAVRVPGHED - 

GVTISGLEPDHKYKMNLYGFHGGQRVGPVSAVGIjTAPGKDEEMA 

PASTEPPTPEPPlKPRIiEELTVTDATPDSLSLSWTVPEGQFDHF 

LVQ YKNGDGQPKATRVPGHEDRVTI SGLEPDNKYKMNLYGFHGG 

GRVGPVSAIGVTAAEEETPTPTEPSMEAPBPPEEPLLGELTVTO 

SSPDSLSLSWTVP<^RFDSFTVQYKDRDGRPQVVRVGGEESEVT 

VGGLEPGRKYKMHLYGLHEGRRVGPVSTVGVTAPQEDVDETPSP 

TEPGTEAPEPPEEPLLGELTVTGSSPDSLSLSWTVPQGRFDSFT 

VQYKDRDGRPQAVRVGGQESKVTVRGbEPGRXYKMHLYGLHEGR 

RLGPVSAIGVT 




3 


1275 


SCDMAEAAAJ4VWIR3PGFGCKAVR<^GRCTVRD^iHRHCQi55FT~ 
VPVENFFVKCKGALINTSDTVQHGAVYSLEPRLCGGKGGFGSML 
RALGAQ IEKTTNREACRDLSGRRLRDVNHEKAMAEWVKQQAERB 
AEKEQKRLERLQRKLVEPKHCFTSPDYQQQCHEMAERLEDSVLK 
GMQAAS S KM VSAE I SENRKRQWPTKSQTDRGAS AGKRRCFWLGM 
EGLETAEGSN S ESS DDDSEEAPSTS GMG FHAP KIG SNGVEMAAK 
FPSGSQRARWNTDHGSPEQLQIPVTDSGRHILEDSCAELGESK 
EHMESRWVTETEETQEKKAES KEPI BEE PTGAGLNKDKETEERT 
DGER VAE VAPEER ENVAVAKLQESQ PGNAV I DKE T ID LLAFTS V 
AELELLGLEKLKCELMALGLKCGGTLQ 


5762 


2 


344 


GSTCQTP LHSQGGG<jGSGGGRRRTPRGMPKEKYB P PD PRRM YT I 
MSSEEAANGKKSHWAELEISGKVRSLSASLWSLTHLTALHLSDN 
SLSRI PSDI AKLHNLVYLDLSSNKIR 


5763


3 


429 


lilJKOTOIjIMLIARLDYELIQRFTLTIIARDGGGEETTGRVRIWV " 
IiDVNDNVPTFQKDAYVGALRENEPSVTQLVRLRATDEDSPPNWQ 
ITYS I VSAS AFGS YFDIS LYEG YGVIS VSRPLDYEQ I SNGL I YL 
TVMAMDAGN 


5764 


19 


441 


VCARACGEMrOi-IiRPIDRORYDENEDLSDVEBIVSVRGFSLBEK 
LRSQLYQGDFVHAMEGKDFNYE YVQREALRVPLI FREKDGLGIK 

MPDPDFTVRDVKLLVGSRRLVDVMOVNTQKGTSMSMSQFVRYYE 
TPBAQRDKL 


5765 


3 


825 


qkilri^shqpptsssnskdosgpas^gaqataaLadgLkIf'as 

VQAS APQGNS HKETS KS KVKRS KTS KDANKS LPS AALYGI PE I S 
STGKRQEVQGRPGEATGMNSALGQSVSSGGSGNPNSNSTSTSTS 
AATAGAGSCGKSKEEKPGKSQSSRGAKRJDKDAGKSRKDKHDIiLQ 
GHQNGSG S QAPSGGHL YG FGAKSNGGGAS PFHCGGTGSGS VAAA 

GBVSK^PDSGLMGNSMLVKKEEEEEESHRRIKKJ^EKVDPLF 
TVPAPPPHV 




i4oe 


663 


SGLF S VDPASSQAMBLSDVTL 1 EGVGNEVMWAGih/VLILALVL 

* »*^^«««"w*«AMiA va/U7ixioVutll^HVDnLiVAGQuNPE | 
PTELPHPSEGNDEKAEEAGEGRGDSTGEAGAGGGVEPSLEHLLD 1 
IQGLP KRQAGAGSSS PE APLRSEDSTCLP P 8 PGLI TVRLKFLND 
TBEIAVARPEDTVGAIiKSKYFPGQESQMKLIYQGRLLQDPARTL 
RSLNITDNCVIHCHRSPPGSAVPGPSASLAPSATEPPSLGVNVG 
SLMVPVFVVLLGVVWYFRINYRQFFTAPATVSLVGVTVFFSFLV 
FGMYGR 


5767


2 


692 


NFRATPRPPTRPklaRTGTEVlLWYlJDWRALMKRKRMKANIKLVG 
SG FPL PSSDLDDS LTEE I DE KIG FROT3ANFDWQNVADFRDAGGS 
LTEVKVEEBERDPQSPEFEIEEEBEMLSSVIPD3RRENELPDFP 
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ID 
NO: 



5768 



"5763" 



"5770" 



5771 



"5772- 
■5773- 



5775 



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



38 



16T 



T76- 



667 



741 



T48- 
_ 



■577T- 



5777 



5778 



383 



723 



538 



-484- 



949 



1210 



Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
HlUEFFTLNSTPSR^AVDEPHLLVNIEKQKLELfiKRRLDISAER 
LQVEKERLQIEKERLRHLDMEHERLQLEKERLQIEREKLRLQIV 
NSEKPSLBNELGQGEKSMLQPQDIETEKLKLERERLQLEKDRLQ 
FLKFSSEKLQIEKERLQVEKDRLRIQKEGHIjQ 



SSRSki^VSVSPPPPGIVELGPPPAWEFCSRLGSAVTSQRAGPA 

I aamvakdypfyltvkrancslelppasgpakdabepsnjcrvkpl 

I S RVTS LANL I P P VKATPIjKRFSQTLQRS I S FRS ESR P D I LAPR P 
WSRNAAPSS rKRRDSKLWSETFPVC 

TKTKXG VKE KATDQS VKAFAEI ICPE LQYVGFMGCS VTS KG VIHL 
TKLRNLSSLDLRHITELDNETAMEIVKRCKNLISLNLCIiNWIIN 
DRCVEVIAKEGQNLKELyLVSCKITDYAI.IAIGRYSMTIETVDV 
GWCKEITDQGATLIAQSSKSLRYliGLMRCDKVNEVTVEQLVQQY 
PHITF5TVLQDCKRTLERAYQMGWTPNMSAASS 



484 I" DS RR YD VKTR kWSFLLEEHSKL IAKVRCLPQ VQLDPL PTTLTLA 

FASQLKKTSLSLTPDVPEADLSBVDPKLVSNLMPFQRAGVNFAI 
AKGGRLLIADDMGLGKTIQAICIAAFYRKEWPLLWVPSSVRFT 
WBQAFLRWLPSLS PDCINWVTGKDRLTA 



GLLPSACLRARSWREASEGPSSRACSNGSUlri-FEACYSGTSTPfi 
FHGS HCSGSDHSS IX3LBQLQD YM VTLRSKLGPLE I QQ FAMLLRE 
YRLGLPIQDYCTGLLKLYGDRRKFLLLGMRPFIPDQDIGYFEGF 
LEGVG IREGGILTDS FGR I KRSMSSTSASAVRS YDGAAQRPEAO 
AFHRLLADITHDIE 



EFNLALVSPSHPQIKAEDDQPLPGVLLSLSGGLFRSNLLTQDNG 
ILTFSNIiVTCSAIYHLPVFPEREPGCSMRDLRVA 



PRVRSKHNFCFMEMNTRLQVEHPVTEMITOTDLVEWQLRIAAGE 
KIPLSQEEITLQGHAFEARIYABDPSNNFMPVAGPLVHLSTPRA 
DPS TR I ETGVRQGDEVS VH YDPMI AKLVVWAADRQAALTKLR YS 
I^QYNIVGIJiTNIDFLLNLSGHPEFEAGNVHTDFIPQHHKQLLL 
SRKAAAKESLCQAALGLI LKE KAMTDTFTLQAHDQFS PFS S S SG 
RRLNISYTRNMTLKDOKNSK 



"555 I FVEBENIR WRCGGSELN FRRA VFS ADS KY I FCVSGDF VKVYST 

VTEE CVHILHGHRNLVTGIQLNPNNHLQLYSCS LDGTIKLWDYI 
DGILIKTFIVGCKLHALFTLAQAEDSVFVIVNKEKPDIFQLVSV 
KLPKSSSQEVEAKELSFVLDYINQSPKCIAFGNEGVWAAVREF 
YLS VYFFKKETTS RVTLS SS 



SSGCCDPAAPSSIAEAATMPVSKCPKKSESLWKGWDRKAQRNGlT 
RSQVYAVNGDYYVGEWKDNVKHGKGTQVWKKKGAI YEGDWKFGK 
RDGYGTLSLPDQQTGKCRRVYSGWWKGDKKSGYGIQFFGPKEYY 
EGDWCGS QR SGWGRMYYSNGD I YEGQ WENDKPNG EGMLRLS QNP 



RLPQD CVCQNIJS ES LGTL CPS k&LL FVP PD 1 DRR,TVEL RLGGNF 
1 I HIS RQDFANMTGLVDLTLS RNTISHI QP FSFLDLES LRS LHIi 
j DSNRLPSLGEDTLRGLVNLQHL I VNKNQLGGIADEAFEDFLLTL 
EDLDLS YNNLHG PAVGLRGDAHVQPS TS 



GQDPEPGQDLFQPEREVDPSWGRGREPRLGKLRFQNDHLSVLKQ* 

1 VKKLEOALKDGSAGLDPQLPGTCYSPHCPPDKAEAGSTLPENLG 

GGSGSEVSQRVHPSDLEGRBPTPELVBDRKGSCRRPWDRSLBNV 

YRGSEGSPTKPFINPLPKPRRTFKHAGEGDJCDGKPG I GFRKEKR 

NLPPLPSLPPPPLPSSPPPSSVNRRLWTGRQKSSADHRKSYEFE 

DLLQSSSESSRVDWYAQTKLGLTRTLSEENVYEDILDPPMKENP 

YEDIELKGRCLGKKCVLNFPASPTSSIPDTLTKQSLSKPAFFRQ 
NSERRNV 



QRRQSVS RLLLPVFLLEPPAE PGLBPPPEKEGGEPAGVAEEPGS 
GGPGWLQLEEVPGPGPLGGGGPLRSPSSYSSDELSPGBPLTSPP 
WAPLG^PERPEHLIiNRVLERLAGGATRI)SAASDILLDDIVLTHS 
LFLPTEKFLQELHQYFVRAGGMEGPEGLGRKQACLAMLLHFIiDT 
YQGIiLQEEBGAGHI IKDLYLLIMKDESLYQGLREDTLRLHQLVE 
TVBLKI PEENQPPSKQVKPLFRHFRRIDSCIjQTRVAFRGSDEI F 
CRVYMPDHSYVTIRSRLSASVQDILGSVTEKLQYSEEPAGREDS 
LILVAVSSSGEKVLLQPTEDCVFTALGINSHLFACTRDSYEALV 






WO 01/53312 



PCT/US00/34263



SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
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amino acid 
residue of 
amino acid 
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Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


5779






PLPEEIQVSJt^utEIHRVEPEDVANHLTAFHHELPRCVHSliEFV 
DYVFHGE 




131 


1S71 


EAVQVLIKHaADVNARDKNWQTPLHVAAANKAVKCAEVIIPLLS 
SVNVSDRGGRTALHHAALNGHVEMVNLLLAKGANINAPninmnp 

ALHWAAYMGHLDVVALLINHGAEVTCKDKKGYTPLHAAASNGQI 
NWKHL LNLG VE I DE I NVYGNTALHI ACYNGQDAVVNE L I D YGA 
NVNQPNNNGFTPLHFAAASTHGALCLELLVNNGADVNIQSKDGK 
SPLHMTAVHGRFTRSQTLIQNGGEIDCVDKDGNTPLHVAARYGH 
EJLLlNTLITSGApTAKCGIflSMFPLHLAALNAHS VCCRKLLSSG 
QKYS I VSLFSNEHVLSAG FEIDT PDICFGRTCLHAAAAGGNVEC I 
KLLQSSGADFHKKDKCGRTPLHYAAANCHFHCIETLVTTGANVN 
ETDDWGRTALH YAAASDMDRNKT I LGNAH DNSEBLERARELKEK 
EATLCLEFLLQNDANPS IRDKEG YNS IHYAAAYGHRQCLELLLE 
RTNSGFSESDSGATKSPLHIAVSEMP 


5780 
5781 


154 


624 


QFFRVITCLPFKGPDYRLYKSEPBLTTVAEVDESNGEEKSEPVS 
EI E TS WKGSHFPVGWP PRAKS PT PE S <3T T A q wtt ,v ir« vvmm 
DLRTER PRSAVEQLCLAES TRPRMTVE EQMER IRRHQQACIiREK 
KKgLNVIGASDQSPLQSPSNLRDNP 




19 


941 


RGSI^GHPWRP^MRAASOJSCLPVSFVTGPHQERAYtiGRGP^GAF^ 
PAPP VSGTCPPDL I YAPTPEKAEGGSQ KNHQ PPPGERAAHRDGE 
QAPCRAGPTRKVAVAPRPPSCP*GPE \ PGEEPRRPLDRS PPLGQ 
VQPHFTSQDAKSAEDEAPSRHLGKHQPRSAQVGSRLDALQGPKT 
QHS1HTVTCKSPRQKEDRSPKPPQAPKHPEEHGRQS\QAPPPLP 
VAPSRTCGGC*TWDPALLVSP/PC3GDSTPELPAP\QQPTGGPSR 

CRQALPPQG+RQQPRQRPR/PTGASRSHPAKAKGCQGPPKIRNY 
NIMD 


5782 


5176


1237 


DRSMMSMAACSYTDSYTDTYTEAYMVPPtPPEEPPTMPPLP'P^E - 

PPMTPPLPPEEPPEGPALPTEQSALTAENTWPTEVPSLPSEESV 

SQPEPPVSQSEISEPSAVPTDYSVSASDPSVLVSEAAVTVPEPP 

PE PESS I TLTPVESAWABEHEWPERP VTC5KVS ETPAMSAEPT 

VLASEP PVMSETAETFDSMRASGHVASEVSTSIiIiVPAVTTPVLA 

ESILEPPAMAAPESSAMAVLESSAVTVLESSIVTVLESSTVTVL 

BPS WTVPEPPWAEPDYVTI PVPWSALEPSVPVLEPAVSVLQ 

PSMIVSEPSVSVQESrVTVSEPAVTVSEQTQVIPTEVAIESTPM 

I LESS IMSSHVMKGINLSSGDQNLAPE IGMQEIALHSGEEPHAE 

EHLKGDFYESEHGIWIDLNlNNHIilAKEMEHNTVCAAGTSPVGE 

IGEEKILPTSETKQRTVLDTYPGVSEADAGETLSSTGPFALEPD 

ATG\TSKGI SFTTASTLSLVNKYDVDLS LTTQDTEHDMLI STS P 

SGGS EAD I EG PLPAKJD I HLDLPSNXNLVS SDTNEPLP VKRD\ DQ 

TLAALI \SLXESSGGEKEVPPPS * REHLPDSGFSANIEDINEAD 

LVRPV^PRTWNVLPSPRAGL\EGP\LLASDFGPVQNLYSSPVV 

\SSMP\ERASGS\SSGEKGG\YEIFVKVKDTR^KSKKNKNRDKG 

EKEKKRDSSLRSRSKRSKSSEHKSRKLTSESRSRARKRSSKSKS 

H RS \QTRSRSRS /RDRRRRSSRS rsksrgrrs vs kekrkrs pkh 

RSKSRERKRKRSSSRDNRKTVRARSRTPSRRSRSHTPSRRRRSR 

SVGRRRSFSISPSRRSRTPSRRSRTPSRRSRTPSRRSRTPSRRS 

RTPSRRSRTPSRRRRSRSWRRRSFS IS PVRLRRSRTPLRRRPS 

RSPIRRKRSRSSERGRSPKRLTDLDKAQLLEIAKANAAAMCAKA 

GVPLPPNLKPAPPPTIEEKVAKKSGGATIBELTEKCKQIAQSKE 

DDDV I VNKPHVSDEEEEEPPF YHHP FKLS E PKP I FFNLK I AAAK 

PTPPKSQVTIiTKEFPVSSGSQHRKKEADSVYGEWVPVEKNGEEN 

KDDDNVFSSNLPSEPVDISTAMSERALAQKRLSENAFDLEAMSM 

LNRAQ E R I DAWAQ LNS I PGQFTGS TGVQVLTQ BQLANTG AQAW1 

KKDQFLRAAPVTGGMGAVLMRKMGWREGEGLGKNKEGNKEPILV 
DFKTDRKGLVAVGERAQKRSGNFSAAMKDLSGKHPVSALMEICN 
KRRWQP PE FI1I1VHDSGPDHRKHFLFR Vtil NGSAYQPNCMFFLNR 
Y 


5783 


1693 


698 



□SGLR VAFTMEG I SNFKTPS KLS E KKUKs VLCSTPTIN I PAS PFM 
QKLGFGTG VNVYIiMKRS PRGLSHS PWAVKKIKP I CNDH YRS VYQ 
KRLMDEAKILKSLHHPNIVGYRAFTEANDGSLCLAMEYGGEKSL 
MDLIEE/PI*SQ/PKH,FQQP/LILKVALNMARGLKYLHQEKKL 
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to first 
amino acid 
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Predicted end 
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Amxno acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








lhgdikssnwikgdfetikicdvgvsLpldenmtvtdpeacyi 

GTEPWKPKEAVEENGVITDKADIPAFGLTLWEMMTLSIPHINLS 
NDDDDEDKTFDESDFDDEAYYAAIiGTRPPINMEEIiDESyQKVlE 
LPS VCTNEDP KDRPS AAHI VEALETDV 


5784 


2669 


1388 


' PRVRPR VRTPHNYYI SR I YGPSDSASRDLWVNTDQME KDKVK IH " 
GIIjSNTHROAARVWIiSPDPPFYGHFIiREITVATGGFIYTGEWH 
RMLTATQYIAPLMANPDPSVSRMSTVRYFDNGTALVVQWDHVHL 
QDNYNLGS FTPOATLLMDGRI I FG YKBI PVL VTQ ISSTNHP VKV 

glsdapwvhriqqipnvrrrtiyeyhrvelqmskitnisavem 
tplptclq fnrog pcv8 s qig fncs wcs klqrcssgfdrhrqdw 
vdsgcfbeskekmcentepvet\fleppqp*srqppssgs*lpp 
e/davtsqfptslpteddtkialhlkdngastddsaaekecggtl 
haglivgililvlivataxlvtvymyhhptsaasiffierrpsr 
w pamkfrrgsgh payae vepvgekegf i vseqc 


5785 


2669 


1388


PRVRPRVRTDHNYYISRiYGPSDSASRDLWVNIDQMEKDKVKIH 
G I LSNTHRQAARVNLS PDPPP YGH PLRB I TVATGGF I YTGEWH 
RMLT ATQ Y IAP LMANFD PS VS RNS TVR YFDNGTAL WQ WDHVHL 
QDNYJTLGS FTFQATLLMDGRI I FGYKEIPVLVTQISSTNHPVKV 
GLSDAFWVHRIQQI PNVRRRTI YEYHRVELQMSKITNI SAVEM 
TPLPTCLQ FNRCX3PCVSSQIGFNCSWCSKLQRCSSGFDRHRQDW 
VDSGCPEESXEKMCENTEPVET\FLEPPQP*ERQPPSSGS*LPP 
E / DAVTSQ FPTSLPTEDDTK IALHLKDNG ASTDDSAAEKKGGTL 
KAGLIVGILILVLXVATAILVTVYKYHHPTSAASIFPIERRPSR 
W PAMKFRRGSGHPAYAEVEP VGEKEG FIVSEQ C 


5786 


25*2 


1674 


SYKLPAAERRASSCSQPPTPTRRRWPAPGRTSRGHRPQM*SGTP ' 
APRPPARSTVSPASPLPKPRAGROGSRPRSACSTPRPC*SLN*M 
S*H*KRNLSQRSSSMSP^PLSC^PHR**RQGLTVAARLP1'WAK 
SPPLACSFCQAAQKSQSLSSGRSTR*PBRMSFRP\SPPGNPAIP 
SLAPSSRP/ PKGRPQCTWI PSRWPASPTAPPTTT*APTS S PGST 
GRSMMTCPTR WTATPWSARASSRPRNWPTP * WR PSGRLS TV* RA 
TGGSTATAP PKRFPRNWNPMMAE 


5787 


2 


1460 


MASAASVTSLADEVNCP\ICQGTLKEAGSLSNCG/HKNFCRACL ~ 
T\RYCEIP\GPD\LEESP\TCP\LCKEPPRP\GSFRPNMOLANV 
VENIERLQLVSTLGLGEEDVCQEHGEKIYFPCEDDEMQLCWCR 
EAGEHATHTMRFLEDAA\APYREQIHKCLKCLIKEREEIQBIQS 
RENKRMQ VLLTQVS TKRQQVI SEFAHLRKFLBEQQS ILLAQL2S 
QDGDILRQRDE FDLLVAGE I CRFSALIEELEBKNERPARBLLTD 
IRSTLIRCETRKCRKPVAVSPBLGQRIRDFPQQALPLQREMKMP 
LEKLCPELDYEPAHISLDPQTSHPKLLIjSEDHQRAQFSVKWQNS 
PDNPQR FDRATCVLAHTG I TGGRHTWWS I DLAHGGSCTVG WS 
EDVQRKGEIJRLRPEEGVWAVRIiAWGFVSAIjGSFP\TRLTLKEQP 
RQVRVSLD YE VG WVTFTNAVTREP I YTFTA5FTRKVI PFFGLWG 
RGSSFSLSS 


5788 


2 


6860 


EHSVSGRSSA YGDATAEGHPAGPGS frGAI S TfrTGHQEGDG " 

SEGEGEGETEGDVHTSNRLHMVRLMLLERLLQTLPQLRNVGGVR 

AIPYMQVILMLTTDLDGEDEKDRGMDNLLSQLIAELGMDKKDV 

SKKNERSALNEVHLWMRLLSVFMSRTKSGSKSSICESSSLISS 

ATAAALLS SGAVD YCLHVLKSLLEY WKSQQNDBE PVATSQLLKP 

HTTSS P PDMS PFFLRQYVKGHAAD VFEAYTQLLTEM VLRLP YQ I 

KK I TDTNSR I PP PVFDHSWF YFLSE YLMIQQTPF VRRQVRKLLL 

F I CGS KE KYRQLRDLHTLD S \H VRGI KKLLEEQG 1 FLRAS WTA 

S PQSALQ YDTL I S LMEHLKACAE I AAQRTINWQKFC I KDDSVLY 

FLLQVS PLVDEGVSPVLIiQLIiSC^CXSSKVLRALAASSGSSSAS 

SSPAPVAASSGQATTQSKSSTKKSKKBEKBECEKDGETSGSQEDQ 

LC^ALVNQLNKFADKETLIQFLRCFLLESN'SSSVRWQAHCXTLH 

IYRNSSKSQQELLLDLMWSIWPELPAYGRKAAQFVDLLGYFSLK 

TPQTEKKLKEYSQKAVEILRTQNHILTNHPNSNIYNTLSGLVEP 

DGYYLES D PCLVCNNPE VPFCY I KLSS I KVDTRYTTTQQWKLI 

GSHT1SKVTVKIGDLKRTKMVRTINLYYNNRTVQAIVELKNKPA 

RWHKAKKVQLTPGQTEVKIDLPLPIVASNLMIEFADFYENYQAS 

TETLQCPRCSASVPANPQVCGNCGENVYQCHKCaiSlNYDEiaDPP 
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to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








lcnacgfckyarfdfmlyakpccavdpienskdrkkaV^nIntl 
ldkadrvyhqlmghrpqlenllckvneaapbkpqddsgtaggis 

STSASVNRYILQLAQEYCX3DCKNSFDELSKIIQKVPASRKELLB 
YDLQQREAATKSSRTSVQPTPTASQYRALSVLGCGHTSSTKCYG 
CAS AVTEHCI TLLRAIiATNPALRHI LVSQGLI RELPDYNLRRGA 
AAMREE VRQLMCLLTRDNPEATQQMNDLI IGKVSTALKGHWANP 
DLASSLQYEMLLLTDSISKEDSCWELRLRCALSLFLMAVNIKTP 
VVVENITLMCLRILQKLIKPPAPTSKKNKDVPVEALTTVKPYCN 
EIHAQAQLWIiXRDPKAS YDAWKKCLPIRG IDGNGKAPSKSELRH 
LYLTEKYVWRWKQPLSRRGKRTSPLDLKLGHNNWLRQVIjFTPAT 
QAARQAACTIVEALATIPSRKQQVLDLLTSYIjDELSIAGECAAE 
YLALYQKLITSAHWKVYLAARGVLPYVGNLITKEIARLLALEEA 

tlstdlqqgyalksltgllssfvevesikrhpksrlvgtvlngy 
lclrklwqrtklidetqdmllemledmttgtesetkafmavci 

ETAKRYNIjDDYRTP VP I FERLCS 1 1 YPEENEVTEFFVTLE KDPQ 
QEDFLQGRMPGNPYSSNEPGIGPIiMRDIKNKICQDCDLVALLED 
DSGMELLVNWKIISLDLPVAEVYKKVWC^TNEGEP^!RIVYR^^RG 

LLGDATEBFI es lds ttdeeedeee vykmagvmaqcggle cmln 

RLAGIRDPKQGRKLLTVLLKLPSYC^KVKVNRQQLVKLEMNTLN 
VMLGTLNLAL VAEQES KDSGGAAVAEQVLS I ME I \ I QAEPNVEP 
LSEDKGNLLLTGDKDQLVMLLDQINSTFVRSNPSVLQGLLRIIP 
YLS FGE VE KMQI LVER FKP YCNFDXYDEDHSGDD KVFL\ DCFCK 
I AAG I K\NNSNGHQL\KDL\ I LQKG I TQNALD\ YMKKHI P/ S AA 
RIWDADI\WKSFCLRPALPFILRLLRGLAIQHPGTQVLIGTDSI 
PNLHKLEQVS\SDEGIGTLA\ENL\LESLREHPDVNKKinA\AR 
RETRAEKKRMAMAMRQKAIX»TLG\MTT>JEKGQVVD/lllTALLEA 
DWEEH EEP\GLTCCICREG YKFQPTKVIiGI YTFTKRWIjGGVW 
ENKPRETSRATS TVSHFNI VH YDC \HLA\AVS LARGREE WESAA 
LQNANTKCNGLLPVWGPHVPESAFATCLARHNTYLQECTGQREP 
TYQLNIHDIKLIiPLRFAMEQS FSADTGGGGRBSNIHLI P YI IHT 
GLYVLNTTRATSREEKNLQGFLEQPKBKWVESAFEVDGPYYFTV 
LALH I L P P E QW RATR VE I LRRLL VTSQARA VAPG GATRLTD KA V 
KDYSAYRSSLLFWAL VDL 1 YNMFKKVPTSNTEGGWS CS LAE YI R 

HNDMPIYEAADKALKTPQEEFMPVETFSEFLDVAGIiLSEITDPE 
SFLKDLLNSVP 


5789 


1 


240? 1 


LPLHAVEKTGRPGQPALKMPGKLRSDAGLESDTAMKKGETLRKQ 
TEEKEKKEKPKSDKTEEIAEEEETVFPKAKQVKKKAEPSEVDMN 
SPKSKKAKK\KEEPSQNDISPKTKSLRKKKEPIEKKVVSSKTKK 
VTKNEEPSEEEIDAPKPKKMKKEKEMNGETREKSPKLKNGFPHP 
EPDCNPSEAASEESNSEIBQEIPVEQKEG\APSNFPISEETIKI* 
LKGRGVTPLFPIQAKTFHHVYSGKDLIAQARTCTGJCTFSFAIPL 
I E KLHG\ ELQDRKRGRAPQVLVLAPTRE LANQVS KDPSDI TKKL 
S VACFYGGTPYGGQFERJMRNG I DILVGT PGRIKDH IQNGKLDLT 
KLNHVVLDBVDQMLDMGFADQVEEILSVAYKKDSBDNPQTLLPS 
ATCFHWVFNVAKKYMKS TYEQVDLIGKKTQ KTAI T VEHLAI KCH 
WTQRAAVIGDVIRVYSGHQGRTI I FCETKKEAQELSQNSAI KQD 
AQSLHGDI PQKQREITLKGFRNGSPGVLVATNVAARGLDIPEVD 
LVIQSSPPKDVESYIHRSGRTGRAGRTGVCICFYQHKEEYQLVQ 
VEQKAGI KFKRIGVPSATEI IKASSKDAI RLLDS VP PTAI SHPK 
QS AEKL IEE KGAVEALAAALAH I SGATS VDQRSLINSNVGFVTM 
ILQCS I EMPNI S YAWKELKEQLGEE I DS KVXGMVPLKGKLGVCF 
D VPTASVTE I QEKWHDSRRWQLS VATEQ PELEGPREG YGGFRGQ 

REGSRGFRGQRDGNRRFRGQREGSRGPRGQRSGGGNKSNRSQNK 
GQKRSFSKAFGQ 


5790 


3786


1585 


ARRQRDPLQAJ.RKRNQELKQQVDSL^SQLkSALEPNkRQHIY - 
QRCI QLKQAI DENKNALQKLS KADESAP VANVNQRKE EEHTLLD 
KLTQOLQGUVVTISRENITEVGAPTEEEEESESEDSEDSGGEEE 
DAEEEEEEKEENESHKWSTGEEYIAVGDFTAQQVGDLTPKKGEI 
LLVI EKKPDGW WIAKDAKGNEGLVPRTYLEPYSEEBEGQES SEE 
GSEEDVEAVDETADGAEVK\QRTDPHWSAVQKAISEAGIFCLVN 
HVSFCYL I VLMRNRMETVEDTNGS ETGFRAWNVQSRGR I FI»VS K 
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corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








PVI^lNTVUVLTl-MGAIPAGFRPSTLSQLLKh'GNCjFRANVFLQ"- 

PELMPSQIAFRDLMWDATEGTIRSRPSRISLILTLWSCKMIPLP 

GMSIQVLSRHVRLCLPDGNKVLSNIHTVRArrfQPKKPKTWTPSP 

QVTRILP CLLDGDCF I R SNSAS PDLG I L FELGIS YI RNSTGBRG 

ELS CGW VFLKLFDASGVPI PAXT YELFLNGGTPYE KG I EVDP S I 

SRRAHGSVFYQIMTMRRQPQLLVKIiRSIiNRRSRNVLSLLPETLI 

GNMCSIHLLIFYRQILGDVLLKDRMSLCSTDLISHPMLATFPMI4 

LEQPDVMDALRSSWAGQES\TIiKRSEKR\PKSFLKVPRFLLVYH 

XGCVliPLL/HTPTRLPPFRWAEEETETARWKVITDFLKQNQENQ 

GALQALLSPDGVHEPFDLSEQTYDFIjGEMRKNAV 


5791


3 


1636 


LRVAEFAGTS R/ IGAGLIQPLHRAPARDHGL jRGGAAPALSVSH 

gn/gkql/amssqgsddeqikrenirsltmsghvgfeslpdqlv 
nrs iqqgfcfnilcvgetgigkstlidtlfntnfedyesshfcp 

NVKLKAQTYEIiQESNVQLKLTIVNTVGFGDQINKEESYQPIVDY 
IDAQFEAYLQEELKIKRSLFTYHDSRIHVCLYFISPTGHSLKTL 
DL LTMKNLDS KVYI I P VI AKADTVS KTELQKFKIKLMS ELVSNG 
VQIYQFPTDDDTIAKVNAAMNGQLPFAWGSMDEVKVGNKMVKA 
RQ YPWGWQVENEKHCDF VKLREML I CTNMEDLREQTHTRHYEL 
YRRCKLEEMGFrDVGPENKPVSVQETYEAKRHEPHGBRQRKEEE 
MKQMFVQRVKEKEAILKEAERBLQAKFEHLKRLHQEBRMKLEEK 
RRLLEEE I I AFS KKKATSE I FHSQS FIATGSNLRKDKDRKNSQ F 
FVKQKVPEHRRSSSQANFIKKKLEVCFDFAVICFITS IFGEQPQ 
LLI FMEKYFQVQGQYISQSE 


5792 


2263 


653 


AAAAPSPAV»WCGVFVVYVVHTCWV^GIVYTRPCSGDASClQPY~ 
IARRPKI^L\RHS FTTTRSHLGAEJHN IDLVLNVEDFDVES KFER 
T VNVS VP KKTRNNGT h YA Y I FLHHAG VLPWHDGKQVHLVS PLTT 
YMVPKPEE I NLLTGE SDTQQI EADKKPTSALDE PVSHWR PRLAL 
NVMADNFVFDGSSLPADVHRYKKMI QLGKTVHYLPILFIDQLSN 
RVKDLMVINRSTTELPIiTVS YDKVS LGRLRFVf I HMQDAVYSLQQ 
FGFSEKDADEVKGIFVDTNLYFLALTFFVAAFHLLFDFLAFKND 
IS FWKKKKSMIGMSTKAVLWRCFS T WIFLFIiLDEQTSLLVLVP 
AG VGAAI ELWKVKKALKMT I FWRGLMPEFQFGT YS ESERKTE E Y 
DTQAMKYLS YLLYPLCVGGAVYS LLNI KYKS WYS WLINS FVNGV 
YAFGFLFMLPQLFVNYKLKSVAHLPWKAFTYKAFNTFIDDVFAF 
I ITMPTSHRLACFRDDWFLVYLYQRWLYPVDKRRVNEFGESYE 
EKATRAPHTD 


5793 
5794 


2263 


653 


AAAAPSPAWWCGVFVVYVVHTCWVMYGIVYTRPCSGDASCIQPY 
LARRPKLQL\RHSFTTTRSHLGAENNIDLVLNVEDFDVESKFER 
TVNVS VP KKTRNNGTLYAY I FLHHAGVLP WHDGKQVHLVS PLTT 
YMVPKPEEINLLTGESDTQQIEADKKPTSALDEPVSHMRPRLAL 
NVMADNFVFDGSS LPADVHRYMKMIQLGiCTVHYLPILFIDQLSN 
RVKDLMVINRSTTELPLTVS YDKVS LGRLRF W I HMQDAVYSLQQ 
FGFSEKDADEVKGIFVDTNLYFLALTFFVAAFHLLFDFLAFKND 
I SFWKKKKSM IGMS TKAVLWRCFST WI FLFLLDEQTSIiLVLVP 
AGVGAAIELWKVKKALKMTIFWRGLMPEFQFGTYSESERKTEEY 
DTQAMICYLS YLL YPLCVGGAVYSLLNIKYKS WYS WLINS FVNGV 
YAFGPLFMLPQLFVNYKLKSVAHLPWKAFTYKAFNTFIDDVFAF 
I ITMPTSHRLACFRDDWFLVYLYQRWLYPVDKRRVNEFGESYE 
EKATRAPHTD 




1 


5016 

( 

) 


MGPRLSVWLbLLPAALIiLHEEHSRAAAKGGCAGSGCGKCDCHGV " 
KGQKGE RGL PGLQGV IGFPGMQG PEGPQGPPGQKGDTGEPGLPG 
i MsiKfciA'i'GASGYPGNPGLPGI PGQDGPPGPPGIPGCNGTKGER 
GPLGP PGLPGFAGNPG PPGLPGMKGDPGE I LGHVPGMLLKGERG 
FPGIPGTPGPPGLPGLQ3PVGPPGFTGPPGPPGPPGPPGEKG<2M 
GLSFQG PKGDKGDQGVSGPPGVPGQAQVQEKGDFATKGEKGQKG 
EPGFQGMPGVGEKGEPGKPGPRGKPGKDGDKGBKGSPGFPGEPG 
YPGLIGRQGP\QGBKGEAGPPGPPGIVIGTGPLGEKGERGYPGT 
PGPRGE PG PKGFPGL PGQPG PPGLPVPGQAGAPGFPGERGEKGD 
RGFPGTS LPG PSGRDGLPGP PGS PGPPGQ PG YTNGI VEOQ PGP P 
3DQGPPGI PGQPGFI GEI GE KGQ KGESCL I CD IDGYRGPPGPQG 
PPGEIGFPGQPGAKGDRGLPGRDGVAGVPGPQGTPGLIGQPGAK 



5795 



1192 



5796 



5797 



5798 



$44 



5799 



1078 



891 



"TlT 



CjEPGEFYPDLkltKGDKQDPG FPGQPGMPGRAGSPQRDGHPGLPG' 
PKGS PGS VG LKGERGP PGGVGFPGSRGDTGP PGP PG YG PAGP IG 
DKGQAGFPGG PGSPGL PGPKGE PGKI VPLPGPPGAEGLPGS PGF 
PGPQGDRGFPGTPGR\PGL\PGBKGAVG\QPGIGFPGPPGPKGV 
DGLPGDMGPPGTPGRPGFNGLPGNPGVQGQKGEPGVGLPGLKGI* 
PGLPG I PGTPGEKGS IG VPGVPGEHGAIGPPGLQG IRGE PGP PG 
LPGS VGSPG VPG IGP PGARGPPGGQG P PGLSGPPG IXGE KGFPG 
FPGLDMPGPKGDKGAQGLPGITGQSGLPGLPGQQGAPGIPGFPG 
SKGEMGVMGTPGQPGSPGPWGAPGLPGEKGD\HGFPGSSGPRGD 
P3LKGDKGDVGLPGKPGSMDKVYMGSMKGQKGDQGSKGQIGPIG 
EKGSRGDPGTPGVPGKDGQAGQPGQPGPKGDPGISGTPGAPGLP 
GPKGSVGGMGLPGTPGBKGVPGIPGPQGSPGIiPGDKGAKGEKGQ 
AGPPG IG I PGLRGEKGDQGIAGFPGS PGEKGEKGS IGIPGMPGS 
PGLKGS PGS VGYPGSPGLPGEKGDKGLPGUDGI PGVKGEAGLPG 
TPGPTGPAGQKGEPGSDGIPGSAGEKGEPGLPGRGFPGFPGAKG 
DKGSKGEVGFPGliAGSPGIPGSKGEQGFMGPPGPQGQPGLPGSP 
GHATEGPKGDRGPQGQPGLPGLPGPMGPPGLPGIDGVKGDKGNP 
GWPGAPGVPGPKGDPGFQGMPGIGGSPGITGSKGDMGPPGVPGF 
QGPKGLPGLQGIKGDQGDQGVPGAKGLPGPPGPPGPYDIIKGEP 
GLPGPEGPPGLKGIiQGLPGPKGQQGVTGLVGIPGPPGIPGFDGA 
PGQKGEMGPAGPTGPRGFPGPPGPDGLPGSMGPPGTPSVDHGFL 
VTRHSQTIDDPQCPSGTK1LYHGYSLLYVQGNERAHGQDLGTAG 
SCLRKFSTMPFLFCNINNVCNFASRNDYSYWLSTPEPMPMSMAP 
ITGKNIRPFISRCAVCEAPAMVMAVHSQTIQIPPCPSGWSSLWI 
GYSFVMHTSAGAEGSGQALASPGSCLEEFRSAPFIECHGRGTCN 
YYANAYS FWLAT I ERS EMFKKPTPSTLKAGELRTHVSRCQVCMR 
RT 



1435 



STRSPTVEYXSAHPHILFMLLKGYEAPQIALRCGIMLRECIRHE" 
PLAKIILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVL 
VADFLEQNYDTI FEDYEKLLQS2NYVTKRQSLKLLGELILDRHN 
FAI MT K YI SKPENLKLMMNLLRDKSPNI QFEAFHVFKVFVAS PH 
KTQPIVE I LLKNQPKLI EFLSS FQKERTDDEQFADEKNYLI KQI 
RDLKKTAP *RALRDSKR 

GRVG WEL WCMY ISP PKD WWDAGDPS LP IRTPAMIGCS FWNRKF 
FGEIGLLDPGMEJVYGGENIELGIKVWLCGGSMEVLPCSRVAHIE 
RKKKPYNSNI G FYTKRNALRVAEVWMDD YKSHVY IAWNLPLEKP 
G I D IGDVSER RALRKSLKCKNFQW YLDHVYPEMRRYNNTVAYGE 
LRNNKAKD VCLDQG PLENHTAILYPCHGWG PQLAR YTKEGFLHL 
GALGTTTLLP DTRCLVDNS KS RLPQLLDCDKVXS SL YKRWNFXQ 
NGArMNKGTGRCLEVENRGLAGIDLILRSCTGQRWTIKNSIK*R 

EGAGAIJEPGPQDMAAPPNIWTSCPGGETARGRQVLDGPPRASPG 
QHRDPQ 

PR VRQKTLVD VTtiENSN I KDQ I RNLQQT YEASI4DKLRE K$RQLE 
VAQVENQLLKMKVESSQEANAEVMREMTKKLYSQYEEKLQEEQR 
KHSAE KEALLEETNS FLKAIEE ANKKMQAAE I SLEEKDQR I GEL 
DRIiIERMEKERHQLQLQLLEHETEMSGELTDSDKERYQQLEEAS 
ASLRERIRHLNDMVHCQQKXVKQMVEEIESLIOCKI^KQLLIIiQ 
LLEKISFLEGENNELQSRLDYLTETOAKTEVE?REIGVGCDLLP 
S QTGRTRE I VM PSRKYTP YTRVLBL TM KKTLT 
KIIiGSRWKSMSNQEKQPYYEEQARLS KIHLKKYPNYKYKfcRPKR 
TCIVDGKKLRIGSYKQLMRSRRQEMRQFFTVGQQPQIPITTGTG 
WYPGAI TMATTTPS PQMTSDCSS TSASPE PS LPVIQSTYGMKT 
DGGS LAGNEM I NGEDEMEMYDDYEDDPKSDYSS ENEAPEAVS AN 



LLSTYIKFINLFPETKATIQGVLRAGSQLRNADVEIjQQRAVEYL 
TLSS VAS TD VLATVLEEMP PFPERES S I LAKLKR KKGPGAGS AL 
DDGRRDPSSNDINGGMEPTPSTVSTPSPSADLLGLRAAPPPAAP 
PASAGAGNLLVDVFDGPAAQPSLGPTPEEAFLSPGPEDIGPP I P 
EADBLLNKFVCKNNGVLFENQLLQIGVKSEFRQNLGRMYLFYGN 
KTSVQFQNFS PTWHPGDLQTQLAVQTKR VAAQVDGGAQVQQVL 
NIECLRDFLTPPLLSVRFRYGGAPQALTLKLPVTINKFFQPTEM 
AAQDFFQRWKQLSLPQQBAQKIFKANHPMDABVTKAKLLQFQSA 








LLDNVDPNPENFVGAGIIQTKALQVGCLLRl,EPNAQAQMyRLTi7" 
RTSKEPVSRHLCELIAQQF 


5800 


2673 


1435 


LLSTYIKFimjFPETKATIQGVLRAGSQLRNADVELQQRAVEYL 
TLSSVASTDVLATVLE2MPPFPERESS I LAKL KRKKG PGAGSAL 
DDGRRDPSSNDINGGMEPTPSTVSTPSPSADLLGLRAAPPPAAP 
PAS AGAGNLL VD VFDG PAAQ PSLGPTPEEAFL SPG PED IGP P I P 
EADELLNKF VCKNNG VLFENQLLQIGVKS E FRQNLGRMYI»FYGN 
KTSVQFQNFSPrWHPGDLQTQLAVQTKRVAAQVDGGAQVQQVL 
NIECLRDFIiTPPLLSVRFRYGGAPQALTLKIjPVTINKFPQPTEM 
AAQDFFQRWKQLSLPQQEAQKI FKANHPMDAEVTKAKLLGFGSA 
LLDNVD PN P EN FVG AG 1 1 QTKALQ VGCLLRLE PNAQAQ MYRLTL 
RTS KEP VSRHLCEIiLAQQF 


5801 


3 


1413 


FPRLYHLIPDGEITSIKINRVDPSEgLfiTPT 1 wir^brpT \>tjttt — 
QH I YRDGVI ARDGRLLPGDI I LKVNGMD I SNVPHN YAVRLLRQP 
OQVLMLTVMREQKFRSRNNGQAPDAYRPRDDS FHVI LNKS S PEE 
OLG I KLVRKVDEPG VP I FNVIiDGGVAYP HOOT ■ PPMno vt . » rxinu 

DLRYGSPBSAAHLIQASERRVHLWSRQVRQRSPDIFQEAGWNS 
NGSNSPGPGERSNTPKPLHPTITCHEKWNIQKDPGESLGMTVA 
GGASHREWDLPIYVISVEPGGVISRDGRIKI^GDILIiNVDGVELT 
BVSRSEAVALLKRTSSSIVLKALEVKEYEPQEDCSSPAALDSNH 
NMAPPSDWSPSWVMWLELPRCLYNCKDIVLRRNTAGSU5FCIVG 
GYEEYNGNKPFFI KS I VEGTPAYNDGR IRCGDItiLAVNGRSTSG 
MIHACLARLLKELKGRITLTIVSWPGTFL 


5802 


3 


290 


CFSLYOIMERIMDLPTLIjRHAFREMP9VfiRT»pl^ - TOT"pTTT rr m 
GAFFYLI S PliDFVPEALFGIIiGFLDDFFVI FLLL I Y I S IMYREV 
ITQRLTR 


5803 


2234 


1299 


EAQFGTTAEIYAYREEQDFGIEIVKVKAIGRQRFkVLEliR'rQSD 
GIQQAKVQILP2CVX,PSTMSAVQLESLNKCQIFPSKPVSREDQC 
SYKWWQKYQKRKFHCANLTSWPRWLYSLYDAETLMDRIKKQLRE 
WD ENLKDDS L PSNP I DFS YRVAACLP IDDVLRIQLLKIGSAIQR 
LRCE LDI MNKCTSLCCKQCQETE I TTKNE I FS LSLCG PMAAYVN 
PHG YVHETIiTVYKACNLNLIGRPSTEHS WFPG YAWT VAQ CK I CA 
SH I G WKF TATKKDKS PQKFWGI»TRSALLPT1 PDTEDE IS PDKVI 
LCL " 


5804 


2 


1707 


EMEKQRQEEQRKRTEEERKRftilEQDMI^KRltiQREIjAKRAEQIE" 
DINNTGTESASEEGDDSLLITWPVKSYKTSGKMKKNFEDLEKE 
REEKERIKYEEDKRIRYEEORPSLKEAKCLST.VMnnPTPQPaK'V 

ESLSPGKLKLTFEELERQRQENRKKQAEEEARKRLEEEKRAFEE 
ARRQMVNEDE ENQDTAKI FKG YR PGKLKLS FEEMERQ RREDE KR 
KAEEEAPJRRIEEEKKAFAEARRNMVVDDDSPEMYKTISQEFLTP 
GKLEINFEELLKQKMEEEKRRTEEERKHKLBMEKQEFEQLRQEM 
GEEE EE NET FGLSRE YEEL I KLKRSGS I QAKNLKS KFE KIGQLS 
EKEIQKKIEEERARRRAIDLEIKEREAENFHEEDDVDVRPARKS 
EAPFTHKVNMKARFEQMAKAREEEEQRRIEEQKLLR14QFEQRE1 
DAALQKKREEEEEEEGS IMNGSTAEDEEQTRSGAPWFKKPLKNT 
SVVDS E PVRFTVKVTGEPKPEITWWFEGEILQDGEDYQYI ERGE 
TYCLYLPETFPEDGGEYMCKAVNNKGSAASTCILTIBSKN 


5805 


3 


776 


YISDTLGQVYKSKlRWWIEENGGNGNISVDDIiIALIiDIiAEHASS" 
AFKES QQQSBDRE YE VKERLYPKS KRR YDTYNIAG YQGE I EVGL 
YTIQILQUPFFDNKNELSKRYMVNFVSGSSDIPGDPNNBYKLA 
LKNYI P YLTKLKFSLKKS FDFFDE YFVLLKPRNNI KQNEBAKTR 
RKVAGYFKKYVDIFCLLEESQNNTGLGSKFSBPLQVERCRRNLV 
ALKADKFSGLLEYLIKSQEDAISTMKCIVNEYTFIjLK 


5806 


1257 


877 


AVFTFHNHGRTANIiYSLHSWLGITTYFIjFACQRFLGFAVFIiLPW 
ASMWLRSLLKPIHVFFGAAILSLSIASVISGINEKLFFSLKNTT 
RPYHSLPSEAVFANSTGMLWAFGLLVLYILLASSWKRP 


5807 


2267 


1302 


rfskktfrrpmavdiqpaclglycgktllfkngsteiygecgvc 
prgqrtnaqkycq pctes pelydwlylgfmamlplviihwff i e w 

YSGKKSSSALFQHITALFECSMAAIITLLVSDPVGVIjYIRSCRV 
LMLS DWYTML YNPS PDYVTTVHCTHEAVYPLYTI VF I YYAFCLV | 








LMMLltRpljLVjCKiA^GlKjIkSDRPKS IY AAL¥ FKPILTVLQAVGG 
GLLYYAFPYI ILVLS LVTLAVYMS ASB I ENCYDLL VR KKRLI VL 

FSHWLLHAYGIISISRVDKLEQDLPI*LALVPTPALPYLFTAKPT 
EPSRILSEGANGH 


5808 


2 


433 


SLPDSGWE^Ui^GCiVADNHKDFGELRYNECLMNFSCNGKNGSS 
EGR ITHGFQ LKSA YENNLMP YTNYTFD FKGVI D YI FYSKTHMNV 

LGVLGPLDPQWLVEKNITGCPHPHIPSDHPSuLTQLELHPPLLP 
LVNGVHLPNRR 


5809 


4*4 


2422 


ILVPGFQG 1 LHPGVYCALQS QHQAQELVAD I DECE VSGLCRKGG 
RCVNTHGSFECYCMDGYLPRNGPEPFHPTTDATSCTEIDCGTPP 
EVPDGYI IGN YTSSLGSQVRYACREGFFS VPEDTVSS CTGLGTW 
ESPKLHCQEINCGNPPEMRHAILVGNHSSRLGGVARYVCQEGFE 
SPGGKITSVCTEKGTWRESTLTCTEILTKINDVSIjFNDTCVRWQ 
INSRRINPKISYVISIXGQRLDPMESVREETVNLTTDSRTPEVC 
LALYPGTNYTVNISTAPPRRSMPAVIGFQTAEVDLLEDDGSFNI 

s i fnetclklnrrsrkvgsbhmyqftvlgqrwylanfshatsfn 
fttreqwwcldlypttdytvnvtllrspkrhsvqitiatppa 

VKQriSNISGFNETCLRWRSIKTADMEEMYLFHIWGQRWYQKEF 

aqemtfnissssrdpevcldlrpgtnynvslralssblpwisl 

TTQ I TEP PLPEVEFFTVHRGPLPRLRLRKAKEKNGP I SS YQVLV 
LPIiALQS TFS CDSEGAS S FTSNAS DATCYVAAEIiLAKDVPDDAM 
EIPIGDRLYYGEYYNAPLKRGSDYCIILRITSEWNKVRRHSCAV 
WAQVKDS S LMLLQMAGVGLGSLAW1 ILTFLS FSAV 


5810 


3 


1641 


KVFGTHKDHEVSTLDTAI S AVK VQLAEFLENIjQE KSLR I EAFVS 
E I ES FFNT I EENCS KNEKRLEEQNEEMMKKVLAQ YDE KAQSFEE 
VKKKKMEFLHEQMWFIjQSMDTAKDTL£TIVREAEELDEAVFLT 
£ FEEINERIjLSAMESTASLEKMPAAFSIjFEHYDDSSARSDQMLK 
QVAVPQPPRLEPQEPNSATSTT1AVYWSMNKEDVIDSFQVYCME 
BPQDDQEVNELVEEYRLTVKESYCIFEDLEPDRCYQVWVMAVNF 
TGCSLPSERAIFRTAPSTPVIRAEDCTVCWNTATIRWRPTTPEA 
TETYTLEYCRQHSPEGEGLRS FSG I KGLQLKVNLQPNDNYFFYV 
RAINAFS7S EQSEAALISTRGTRFLLLRETAHPALHISSSGTVJ 
SFGERRRLTEIPSVLGEELPSCGQHYWETTVTDCPAYRLGICSS 
SAVQAGAIjGQGETSWYMHCSEPQRYTFFYSGIVSDVHVTERPAR 
VGIiLDYNNQRLIFINAESEQLIiFI IRHRFNEG VHPAFALEKPG 
KCTLHLGIBPPDSVRHK 


5811 
5812 


1918 


851 


AAAlJU)PLPEDKWSAEKRRPLKSSLGYEiTFSLLNPDPKSHD^ 
WDIEGAVRRYVQPFLNALGAAGNFSVDSQILYYAMLGVNPRFDS 
ASSSYYU3MHSLPHVINPVESRLGSSAASLYPVLNFIJUYVPELA 
HS PL YI QDKDGAP VATNAFHS PRWGG I MVYNVDS KTYNASVLPV 
RVBVDMVRVMEVFLAQLRliFGlAQPQLPPKCLLSGPTSEGLMT 
WELDRLLWARS VEN1ATATTTLTS LAQLLGKISNI VI KDD VASE 
VYKAVAAVQKS AEELASGHLASAFVAS QEAVTSS ELAFFD PSLL 

HLLYFPDDQKFAIYIPLFLPMAVPILLSLVKIFLETRKSWRKPE 
KTD 




5204 


2744 


GGRQRCQRGRS CGAREBE VE PGTARPP PAAS AMDASLE KI ADPT "" 
LAEMGKNLKEAVKMLEDSQRRTEEENGKKIjISGD IPGP30QGSGQ 
DMVS ILQLVQNLI4HGDEDEBPQSPRIQNIGEGGHMALIjGHSLGA 

yistldkeklrklttrilsdttlwlcri fryengcayfhbeere 

GLAKICRIjAIHSRYEDFVVDGFNVLYWKKPVIYLSAAARPGLGQ 

ylcnqlglpfpclcrvpcntvfgsqhqmdvaflekli kddierg 
rlplllvanagtaavghtdkigrlkelceqygiwlhvegvnlat 
lalgyvsssviiaaakcdsmtmtpgpwlglpavpavtlykhddpa 
ltlvagltsnkptdiojralpijwlslq ylgldgfver i khacqls 

QRLQESIiKKVNYIKILVBDELSSPWVFRFFQELPGSDPVFKAV 

pvpkmtpsgvgrerhs cdal1wiwlgbqlkqlvpasgltvmdlea 
egtclrfsplmtaavlgtrgedvdqlvaciesklpvlcctlqlr 
ebfkqeveatagllyvddpnwsgigwryehrnddksslksypq 
genihaglucklnelesdltfkigpeyksmksclyvgmasdnvh 
aaelvetiaatareiednsrllenmtewrkgiqeaqvelqkas 
eerlleegvlrqi p wgs vlnwfsp vqalqkgrtfnltagsles 








TKPIWYKAQGAGVTLPPTPSGSRTkOKLPGOKPPKKSLRG^DA 
LSKU-SSVSHIEDLEKVERLSSGPEQITLEASSTEGHPGAPSPQH 
TDQTEAFQKGVPHPEDDHSQVEGPESLR 


5813 


293S 


699 


HRDGVSGSLERPLTDRSRTGAFAQQRGKMATAGGGSGADPGSRG 
LIiRLI^FCVLLAGLCRGNSVERKIYIPLNKTAPCVRLLNATHQI 
GCQSSISGDTGVIHWEKEEDLQWVLTDGPNPPYMVLLESKHFT 
RDIiMEKLKGRTSRIAGLAVSLTKPSPASGFSPSVQCPNDGFGVY 
SNSYGPEFAHCREIQWNSLGNGLAYEDFSFPIFLLEDENETKVI 
KQ CYQDHNLSQMGSAPTFPLCAMQIjFSHMAWLS FSTAT\ CMRRS 
SIQSrFSINPKIVCDPLSDYNVWSMLKPINTTGTLKPDDRVWA 
ATRLDSRS FFWKTV\APGAES AVAS FVTQLAAAEAIjQKAPDVTTL 
PRNVMFVFFOGETFDYIGSSRMVYDMEKGKFPVQLENVDSFVEL 
GQVALRTSLELWMHTDPVSQKNESVRNQVEDLLATLEKSGAGVP 
AVILRRPNQSQPLPPSSLQRFLRARNISGWLADHSGAFHNKYY 
QSIYDTAENIOTSYPEWLEPLKE/ETWNFG*QDrAKALADVATV 
LGRALYEIJVGGTNFSDTVQADPQTVTRLLYG \ FLI KANNS WFQS 
ILQGRDLRSYLG*RGLFQH\YIAV\SSPTNTIYV/VLQYALANL 
TGTVVNLTREQCQDPSKVPSENKDLYEYSWVQGPLHSNETDRLP 
RCVRSTARLARALSPAFELSQWSSTEYSTWTESRWKD1RARIFL 
IAS KELELI TLTVGFG I LI FSLI VTYCINAKADVLFIAPREPGA 
VSY 


5814 


8500 


432 

1 


AliKC^PRRVIAILVGPVQPDRMAEEGAVAVC^V^PLl^REESL - 
GBTAQVYWKTHNNVI YPVDGS KSFNFDRVLHGNETPKNVYEA\ I 
AAPIIDSAIQGYNGTIFA\YGQT\ASGKTYTMMGSEDHLGVIPQ 
GQFHGHFSQKI * EVFLDRE FLLRVSYMEI YNBTITDLLCGTQKM 
KPLI IRSDVNRNVYVADLTBEWYTSEMALKWITKGEKSRKYGE 
TKMNQRSSRSHT I FRMI LESREKGEPSNCEGS VKVSHLNLVDLA 
GSERAAQTGAAGVRLKEGCNINRSLFIIjGQVIKKLSDGQVGGFI 
NYRDS KLTR I LQNSI/3GNPXTRI I CT I TP VS FDE TLTAL Q FAST 

akymkntpyvnevstdeaulkryrkeimdlkkqleevsletraq 

AMEKDQIiAQLLEEKDI,I.QKVQNEKIEI3LTRMLVTSSSLTLQQ3L 

kakrkrrvtwclgkinkmknsi^adqfniptnittkthki^inl 

LREIDESVCSESDVFSNTLDTLSEIEPraJPATKLLNQENIBSELN 

slradydnlvldyeqlrtbkeemelklkekndldbfealerktk 

KDQEMQL I HE I SNLKNLVKHRE VYNQDLENELS S KVELLREKED 
QIKKLQEYIDSQKLENIKKDLS YSLES IEDPKQMKQTLFDAETV 
ALDAKRESAFLRSENLELKEKMKELATTYKQMBNDIQLYQSQLE 
AKKKMQVDLEKELQSAFNE ITKLiTSL I DGKVPKDLLCNLELEGK 
I TDLQKE LNKEVEENEALRBE V I LLSBLKSLPS BVERLRKE IQD 
KSEELHI ITSE1CDKLFSEVVHKESRVQGLLEEIGKTKDDLATTQ 
SNYKS TDQE FQN FKTLHMD FE Q KYKMVLEENERMNQE I VNLS KE 
AQKFDSSLGALKTELSYKTQELQEKTREVQERLNEMEQLKBQLE 
NRDSPLQWEREKTLITEKLQQTLEEVKTLTQEKDDLKQLQESL 
QIBRDQLKSDIHDTVNMNIDTQEQLRNALESLKQHQETINTLKS 
KI SEE VSRNLHMEENTGETKDEFQQKMVGIDKKQDLEAKNTQTL 
TADVKDrfEIIEQQRKIFSLIQEKNELQQMLESVIAEKEQLKTDL 
KENIEMTIENQEBLRLLGDELKKQQBIVAQEKNHAIKKEGELSR 
TCDRIiAEVEEKLKBKSQQLQBKQQQLLNVQEEMSEMQKKINB IE 
NLKNELKNKELTLEHMETERLBLAQKLNENYEEVKS ITKERKVL 
KELQKS FBT BRDHLRG Y I RB I EATGLQTKEELKI AH I HLKEHQE 
TIDELRRSVSEKTAQI INTQDLEKSHTKLQBEIPVLHEEQELLP 
n v A^v^is iyisiMWlSijbiiJjTEgSTTKDST^LAR IEMERLRLNEKF 
QESQEEIKSLTKERDWLKTIKBALE ViCHDQLKBH IRETLAK IQE 
SQSKQEQSLNMKEKDNETTKI VSEMEQFKPKDSALLR IEI EMLG 
LSKRLQESHDEMKSVAKEKDDLQRLQBVLQS ESDQLKENI KB IV 
RJOiLETEEELKVAHCCLKEQEETINELRVNLSEKETEISTIQKQ 
LBAINDKLQNKIQEIYEKEEQLNIKQISEVQEKVNBLKQFKBFR 
KAKDSALQSIESK^ELTNRLQESQEEIQIMIKEKEEMKRVQEA 
[iQIERDQLKENTKEIVAKMKESQBKEYQFLKMTAVNBTQEKMCE 
rEHLKEQFETQKLNLElJIETENIRiTQILHENLEEMRSVTKERD 
3IiRSVEETIiKVERIX)LKENLRETITRDLEKQEELKIVHMHLKEH 








QETIDKLRGIVSEKTNEISNMQKDLBHSNDALKAQDLktQEELR 
IAHMHLKEQQETIDKLRGIVSEKTDKLSNMQKDLENSNAKLQEK 
IQELKAMEHQLITLKIOVNETQKK7SEMEQLKKQIKDQSLTLSK 
LE I BNLNLAQKLHENL5EMKS VMKERDNLRRVEETLKLERDQLK 
ESLQETKARDLEIQQELKTARMLSKEKKETVDKLREKISEKTIQ 
ISDIQKDIJ)KSKDEI^KKIQELQKKEI^IdiRVKEI)VmSHKKIN 
EMEQLKKQFEPNYLCKCEMDNFQLTKKLHE5LEEIRIVAKERDE 
IiRRIKESI»KMERDQFIATLREMIARDRQNHQVKPEKRI*LSDGQQ 
HLMESLREKCSRIKELLKRYSEMDDHYECLNRLSLDLEKEIEFH 
RIMKKLKYVLSYVTKIKEEQHECINKFEMDFIDEVBKQKELLIK 
IQHLQQDCDVPSRELRDLKLNQNMDLHI EE ILKDFSESEFPS I K 
TE FQQVLSNRKEMTQ FLE EWLNTRFDI E KLKNGIQKENDRI CQV 
NNFFNNRI I AI MNESTE FEERSATI SKE WEQDLKSLKE KKEKLF 
KNYQTLKTSLASGAQVNPTTQDNKNPHVTSRATQLTTEKIRELE 
NS LHEAKE SAMHKE S KI I KMQKELEVTNDI 1 AKLQAKVHESNKC 
LEKTKETIQVLQDKVALGAKPYKEEIEDLKMKLGKIDLEKMKNA 
KEFEKEISATKATVBYQKEVIRLLRENURRSQQAQDTSVISEHT 
DPQPSNKPLTCGGGSG1VQNTKALILKSEHIRLBKEISKLKQQN 
ROLIKQKWELLSNNQHLSNEVKTWKERTLKREAHKQVTCENSPK 
SPKVTGTASKKKQITPSQCKERNLQDPVPKESPKSCFFDSRSKS 

LPSPHPVRYFDNSSLGLCPEVQNAGAESVDSQP\GPWARLFQGK 
DVP\ECKTQ 




23 


1460 


S EL VMWTVQNRES LGLLS F P VM T T M Vf*P AH <z tm ppqnmcy^ tvcv 
VDRUjKGYD IRLRPDFGG P PVDVGMRI DVAS IDMVSEVNMDYTL 
TMYFQQSWKDKRLSYSGI PLNLTLDNRVADQLMVPDTYFLNDKK 
SFVHGVTVKNRMIRI^PIX5TVI,YGLRITTTAACMMDLRRYPLDE 
CNCTLEIESYGYTTDDIEFYWNGGEGAVTGVNKIELPQFSIVDY 
KMVSKKVEFTTGAYPRLSLSFRLKRNIGYFILQTYMPSTLITIL 
SWVS FWINYDAS AARVALG I TTVLTMTTISTHLRETLPKI P YVK 
AIDIYL^CFVFVFIiALLEYAFW5fIFFGKGPQKKGASKQD0SA 
NEKNKLEMNKVQVDAHGNILLSTLEIRNETSGSEVLTSVSDPKA 
TMYSYDSAS I Q YR KPLS S RE \A*GRAPDRHG VPS KGR I RRRAS \ 
QLKVKI PDLTDVNS IDKWSRMFFP IT FSLFNWYWLYYVH 


5816 


861 


131 


TSSRSRAAAQEGDAETPGSVERRGRRAGAEDGMSQAPGAQPSPP 

TVYHERQRLELCAVHALNNVLQQQLFSQEAADEICKRLAPDSRL 

NPffi^lJ^TGl^DVNVIMAAIiQGLGIJUlVWWDRRRPI^ 

VI^LILNLPSPVSLGLLSLPLRRRKLRWPCSUUi/VTVSYYNLDS 

K\ LRAPEGPGGLRTE\ +G PFLAAALAQGLCEVLLVVTKEVEE KG 

SWLRTD 


5817 


851 


118 


rlfrgpganrgrscsg£sggrepsgga1iPkrhcpc*ppsppaad 
vmsnttvpnapqansdsmvgyvlgpfflitlvgvwawmyvqk 
kkrvdrlrhhllpmysydpaeelheaeqellsdmgdpkw\qag 
rvatstsgcwcwpisrrdlitpi^phpsbpgvldclgpchliipllsp 
gspcwvlglhfslhppsaasashaltitslppgllpfvgvelta 

HPQALMGRGF PS GMAAAGRHLC FI> | 


5818 


3 


3918 


QAI^PKLWIFLVQSFYAVRHTESWK^ - 
DRRLGKKP I FSS SQQRKQVSDSGDIKIKS WRGNNKKECWSYLST 
NKKMKS DGLGASGHS SSTNRNS INKTLKQDDVKEKDGTKIAS KI 
TKELKTGGKNVSGKPKTVTKSKTENGDKARLENMSPRQVVERSA 
TAAAAATGQ KNLLNG KG VRNQEGQI SGARP KVLTGNLNVQAKAX 
PLKKATGKDSPCLSIAGPSSRSTDSSMEFSISTECLDEPKEWGS 
TEEEKPSGHKLS FCDSPGQMMKNS VDSVKNSTVAI KSRPVSRVT 
NGTSNKKS I HEQDTNVNNS VLKKVSGRGCS EPVPQAI LKKRGTS 
NGCTAAQQRTKSTPSNLTKTQGSQGESPNSVKSSVSSRQSDENV 
AKLDHNTTTEKQAP KKKMVKQVHTALPKVNAKIVAM PKNLNQS K 
KGETLNNKDS KQKM P PGQVI S KTQPS SQRPLKHET3TVQKSM FH 
DVRDNNNKDSVSEQKPHKPLINIiASEISDAEALQSSCRP\DPQK 
PLNDQEKEKIiAIiECQNlSKLDKSLKHELESKQICLDKSETKFPN 
H KE TDD CDAANI CCHS VGS DNVNS KFYS TT ALKYMVS N PNENS I> 
NSNPVCDLDSrSAGQIHLISDRENQVGRKDTNXQSSIKCVSDVS 
LCNP ERTNGTLNSAQEDKKSKVPVEGLTI PS KLSDES AMDEDKH 








ATADSDVSSKCFSGOi^EKNSPKNMBTSESPESHETPETPPVGH 
WNLSTGVLHQRESPESDTGSATTSSDDIKPRSEDYDAGGSQDDD 
GSNDRGISKCGTMLCHDFLGRSSSDTSTPEELKIYDSNLRIBVK 
MKXQSSNDLFQVNSTSDDEIPRKRPEIWSRSAIVHSRERENIPR 
GS VQFAQE I DQVSS SADETEDERS EAENVAENFS I SNPAPQQFQ 
GI INLAFEDATENECREFSANKKFKRSVLLSVDECBELGSDEGE 
VHT P FQAS VDSFS PSDVFDG I S HEHHGRTCYSRFSRESEDNI LE 
CKQNKGNSVCKMESTVLDLSS I DS S RKNKQ S VS ATE KKNT I D VL 
S S R S RQ LLRED KKVNNGSNVENDI QQRS KFLDSDVKS QERPCHL 
DLHQRE P N S DI P KNS S TKSLDS FRS Q VL PQEGP VKESHSTTTE K 
AN I ALS AG D I DDCDTLAQTRM YDHR P S KTLS P I YEMDVI EAFEQ 
KVES3 PHVTDMDF* DDQH FAKQDWTLLKQLLSEQDSNLDVTNS V 
PEDLS LAQ YL INQTLLLARDS S KPQGITH I DTLNRWSELTSPLD 
SS AS I TMAS FSS EDCSP QGE WT I LE LETQH 


5819 


1 


5557 


AAAGL LGAfcHL VMT L WAAARAE KE AFVQS ES 1 1 EVLRFDDGGL 
LQTETTLGLSS YQQKS I S L YRGNCR PI R FE PPMLDFHEQP VGM P 
KMEKVYLHNPS S E * T I TLVSI FATTSHFHAS FFQNRKI LPGGNT 
SFDVS/VFI^V\raNVEOTLFira , SNHGVFTY\QVFGVGVPNPY 
RLR PFLGARVTVNSS FS P I IN IHNPHS EPLQWEMYSSGGDLHL 
ELPTGQQGGTRJOjWBIPPYETKGVMRASFSSREAZMWTAFIRIK 
TNASDSTEFI ILPVEVEVTTAPGI YSSTEMLDFGTLRTQDLPKV 
LNLHLLNSGTKDVPITSVRPTPQ\NDAITVHFKP I TLKAS \esk 
YTKVASISFDASKAKKPSQFSGK1TVKAKEKSYSKLEIPYQAEV 
LDG YLGFDHAATLFHI RDS PADPVER P I YLTNTFS FAIL I HDVL 
LPEEAKTMFKVHNFSKPVLILPNESGYIFTLLFMPSTSSMHrDN 
N I LL I TNAS KFHLPVRVYTGFIiDYFVL PPK1 EERF ID FGVLSAT 
EASNILFAI INSNPIELAIKSWHIIGDG\LS2ELVAVDRGNRTT 
IISSLPECEKSSSSDQSSVTLASGYF\AVFRVKLTAKKL\EGIH 
DGAIQITTDYEILTIPVK\AVIAVGSLTCSPKHWLPPSFPGKI 
VHQSLNIMNSFSQKVKIQQIRSLSBDVRFYYXRLRGNKEDLEPG 
KKS KIANI YFD PGLQCGDHCYVGLPFLS KSEPKVQPG VAMQEDM 
WDADWDLHQSLFKGWTGI KENSGHRLSAI FEVNTDLQKNI I SKI 
TABLSWPSILSSPRHLKFPLTNTNCSS\EEEITLENP/SQDVPV 
YVQFI PLALYSNPS VFVDKLVSRFNLS KVAKI DLRTLEFQVFRN 
S AHPLQSSTGFMEG\LS PHLI LNL I LKPGEKXS VKVK\ FTP VHN 
RTVSSLI I VRNNLTVMDAVMVQGQGTTENLRVAGKLPGPGS SLR 
FKITEALLKDCTDSLKLREPNFTLKRTFKVENTGQLQIHIETIE 
ISG YS CEG YG FKWNCQEFTLS ANASRD I 1 1 LFT P D FTAS RVI R 
ELKFITTSGSEFVFILNASLPYHMLATCAEALPRPNWEIALYII 
ISG IMSALFLLVIGTA\ YLEAQGI WE P\ FRRRLS \ FEASNPPFD 
VGRPFDLRRIVGISSEGNLNTLSCDPGHSRGFCGAGGSSSRPSA 
GSHKQ* GPSGHPHSSHSNRNSADVDDVRAYNSGRTSSMTSAOAA 
SSCPANKTRPLVLDSNTGAQGHSAGRKSKGAKQSQHGSQHHAHS 
PLEQH PQP PLPPP VPQ PQEPQPERLS PAPLAHPSHPERASSARH 
SSEDSDITSLIEAMDKDFDHHDSPALBVFTEQPPSPLPKSKGKG 
KPLCRKVK P P KKQ EE KEKKGKGKPQEDE L KDS LADDDS S STTTB 
TSNPDTEPLLKEDTEKQKGKQAMPEKKESEMSQVKQKSKKLLNI 
KKEIPTDVKPSSLELPYTPPLESKQRRNLPSKIPLPTAMTSGSK . 
S RNAQ KT KG TSKL VDNRPP ALAKFLPNSQE LGNTS S S EGEKDSP 
PPEWDS VPVHKPGS STDSL YKLSLQTXKAD I FLKQRQTS PTPAS 
PS PPAAPCP FVARGS YSS IVNSSSSSDPKI KQPNGS KHKLTKAA 

HAP VDS DG S DSSGLW S P VSNPS S PDFT PLNS FS AFGNS FNLTGE 
VFS KLGLSRS CNQAS QRS WNE FNS GPS YL WES PATDPS P S WPAS 
S GS PTHTATS VLGNT SGLWST TP FSSS I WSSNLS SAL P FT TPAN 
TLASIGLMGTENSPAPHAPSTSSPADDLGQTYNPWRIWSPTIGR 
RSSDPWSNSHFPHEN 


5820 


310 


1270 


RVSLSGPVSLGVLLCARSSTMGKRDNRVAYMiJPIAMARSRGPIQ 
S SGPTIQ\ VI * I DQGLPGKK* KSN * KRKR K/DS KALAE FEEKMN 
ENWKKEL E KHREKLL SGS ESS S KKRQ RKXKEXKKS W * \DSSSS\ 
SSSSDSSSSSSDSEDEDKJCQGKRRKKKKNRSHXSSESSKSETBS 








DSKDSLKKiOCKSKDGTEKEKDIKGLSKKRKMYSEDKPLSSESLS 
ES E YI EEVRAKKKKfl SE BREKATEKTKKKKKHKKHSKK KKKKAA 
SSS PDS P *H* EKSGFP YKESAMS EE I S TVKTTTYIiLKCMNFLVF 
GIIPGXiFSSHSDATV 


5821 


179 


915 


KWKNQSWRWPKPGTNWMISCSVCWRRVtW^ 
PT/IKDCSIAATGKRPSARFPHQRRKKRREMDDGLAEGGPQRSN 
TYVIKLFDRSVDLAO.FSENTPLYPI CRAWMRNS PSVRERECS PS 
SPLPPL PEDEEG\SEVTNSKSR* CVQACPPTHTPGGQPKNACR\ 
SRIPSPLAALRMQGTP*RWSPFEPEPSPSTLIYRNMQRWKRIRQ 
RWKEASHRNQLRYSESMKILREMYERQ 


5822 


4S4 


4373 


QTLKEMPIVMARDLEETA^SSEDEEVISQEDHPCIMWTGGtlRRi ' 
PVLVFHADA I LTKDNN1 R VIG ERYHLS YKI VRTDSRLVRS I LTA 
HG FHEVHPS STD YNLMWTGS HLKP FLLRTLS EAQKVNHF PRS YB 
LTRKDRLYKN 1 1 RMQHTHG FKAFH I LPQTFLLPABYAEFCNS YS 
KDRGPWIVKPVASSRGRG\VYLINNPNQISLEENILVSRYINNP 
LL I DDF XFDVRL YVLVTS YDPLVI YL YEEGLAR FATVR YDQG AK 
NI RNQ FMHLTN YSVNKKSGD Y VS CDDP3VEDYGNKWSMS AMLRY 
LKQEGRDTTALMAHVEDL 1 1 KTI I SAELAIATACKTFVPHRS S C 
FELYGFDVL IDSTLKPKLLEVNLSPSLACDAPLDLKIKASMI SD 
MFTWGFVOQDPAQRASTRPIYPTFESSRRNPFQKPQRCRPLSA 
SDAEM KNLVGSARE KGPGKLGGS VLGLSMEE I KVLRRVKE ENDR 
RGGFIRI FPTS ETWE X YGS YLEHKTS MNYMLATRLFQDRNTADG 
APELKI ♦ S LKS KAKLHAALYERKLLS LEVRKRRRRS SRLRAMRP 
KYPVITQPAEMNVKTETBSEEEEEVAIiDNEDEEQEASQEBSAGF 
LRENQAKYTPSLTALVEOTPKENSMKVREWNKKGGHCCKLETQE 
LEPK FNLMQI LQDNGNLS KMQARIAFSAYLQHVQI \RLMKDSGG 
QTFSASWAAKEDEQMELVVRFLKRASNNLQHSLRMVliPSRRLAL 
LERTRILAHQLGDFI I VYNKETEQMABKKSKKKVEEEEEDGVNM 
ENFQEFIRQASEAELEEVLTFYTQKNKSASVFLGTHSKISKNNN 
NYSDSGAKGDHPETIMEEVKIKPPKfVinTTRTUGnvT cdi?*twt»o a, 

EKEAKLVYSNSSSGPTATLQKI PNTHLSSVTTSDLSPGPCHHSS 
LSQIPSAIPSMPHQPTILIiNTVSASASPCLHPGAQNIPSPTGLP 
RCRSGSHTIGPFSSFQSAAHIYSQKLSRPSSAKAGSCYLNKHHS 
G IAKTQKEGEOASLYSKRYNQSMVTAELQRLAE KQAARQYS PSS 
HINLLTQQVTNLNLATGI INRSSASAPPTLRPI ISPSGPTWSTQ 
SDPQAPENHSSSPGSRSLQTGGFAWEGEVENNVYSQATGWPQH 
K YHPT AGS YQLQ FALQQLEQQKLQSRQLLDQS RARHQAI FGSQT 
LPNSNLWTMNNGAGCR I SS ATASGQKPTTLPQKWP P PSSCASL 
VPKPP PNHEQVLRRATSQKASKGSSAEGQLNGLQSSLNPAAFVP 
ITSSTDPAHTKIMNHKHTEKQPVHHSWVHD 


5823 


42 


2293 


LLTALSMEGGGGRDEPSACRAGDVNMDDPKKEDILLLADEKFDF 
DJUSLSSSSANEDDEVFFGPFGHKERCIAASLELNNPVPEQPPLP 
TSESP PAWS PLAGE KF VE VYKEAHLLALH I ES S S RNQ AAQAAKP 
EDPRSQGVERFIQESKF\KINLFEKEKEMKKSPTSLKRETYYLS 
DS P LLG P P VGEPRLLAS S PALPS SGAQARLTRAPG PPHSAHALP 
RESCTAHAASQAATQRKPGTKLLLPRAASVRGRGI PGAAEKPKK 
E I PAS PS RTKI PAE KES HRD VL P D KP APGA VNVPAAGS HLGQG K 
RAIPVP\NKLGLKKTLLKAPGSYSN\ljQRKSSSGA\VWSGASSA 
CTPQPVAKAKS SEFASI PAN * LPGLCPNI SKS \GRMGPAMLRPA 

l\pagpvg \asswqakrvdvs elaaeqltapp\sasptqpqtpe 
ggg\qwlnsscawsessqlnktrsirrrdsclmsktkvmptptn 
q fki pkfs igds \ p ds s tp kls raqrpqs cts vgrvt vhstp vr 

RSSGPAPQSLLSAMRVSALPTPASRRCSGLPPMTPKTMPRAVGS 

pl\cvparrrsseprknsamrteptresnrktdsr\lvdvspdr 
g s ppsrvpqalnfs p ees dstfs ks tate varee akpggdaaps 
eallvdikleplavtpdaasqplidlplidfcdtpeahvavgsb 
srpl i dlmtntpdmnknvakps p wgql i dlss pi. i qt £ peadk 
envdspllkf 


5824 


42 


2293 


LLTALS M EGGGGRDE PS ACRAGD VNMDDPKKED I LIjILADEKFDF 
DLSLSSSSANEDDEVFFGPFGHKERCIAASLELNNPVPEQPPLP 
TSES PFAWS PLAGElCFVEVYKBAHliLAl^ IES SSRNQAAQAAKP 








EDPRSQGVERymKSKF\KlNLFEKEKEMKKSPTSLKRETYYLS 
DSPLLGPPVGEPRLLASSPALPSSGAQARLTRAPGPPHSAHALP 
RESCTAHAASQAATQRKPGTKLLLPRAASVRGRGIPGAAEKPKK 
E I PASPSRTKI PAE KESHRDVLPDKPAPGAVNVPAAGS HLGQGK 
RAI PVP \NKLGLKKTLLKAPGS YSN\LQRKSSSGA\VWSGASSA 
CTPQPVAKAKSSEFAS IPAN* LPGLCPNIS KS \GRMGPAMLRPA 
L \ PAGFVG \ASS WQAKRVDVS ELAAEQLTAPP \SAS P TQPQTPE 
GGG\QWLNSSCAWSESSQLNKTRSIRRRDSCLNSKTKVMPTPTN 
QFKXPKFSIGDS\PDSSTPKLSRAQRPQSCTSVGRVTVHSTPVR 
RSSGPAPQSLLSAWRVSALPTPASRRCSGLPPMTPKTMPRAVGS 
PL\CVPARRRSSEPRKNSAMRTEPTRESNRKTDSR\LVDVSPDR 
GSPPSRVPQALNFSPEESDSTFSXSTATEVAREEAKPGGDAAPS 
BALLVDIKLEPLAVTPDAASQPLIDIjPLIDFCDTPEAHVAVGSE 
SR PL IDLMTNTPDMNKNVAKPS P WGQLIDLS S PL I QL S PEAD K 
ENVDSPLLKF * ( 


5825 


2 


4210 


kLQIESASPAPFSSGFLAAHPHSPGGSLATKGHSRLSAPGMLHL 1 
SAAPPAPPPEVTATARPCLCSVGRRGDGGKMAAAGALERSFVEIi 

sgaererprhfreftvcs igtanavagavkysesaggfyyvesg 

KLFS VTRNRFIHWKTSGDTLBIiMEES LD I NLLNNAIRLKFQNCS 

vlpggvyvsetqnrviilmltnqtvhrlllphpsrmyrselwd 
sqmqsiftdigkvdftdpcnyqlipavpgispnstastawlssd 
gealfalpcasggifvlklppydipgmvswelkqssvmqrllt 

GWMPTA2RGDQSPSDRPLSLAVHCVEHDAFI FALCQDHKLRMWS 
YKEQMCLMVADMLEWPVKKDLRI.TAGTG3IKLRLAYSPTMGLYL 
GIF\MHAPKRGQFC1FQLVSTESNRYSLDHISSLFTSQETLIDF 
ALTSTDIWALWHDAENOTVVKYINFEHNVAGQWNPVFMQPLPEE 
E2VIRDDQDPREMYLQSLFTPGQFTNEALCKALQIFCRGTERNL 
DLS WS ELKKEVTLAVENELQGS VTEYEFSQEEFRNLQQEFWCKF 
YACCLQYQEAI^HPLAIJiliNPHTNMVCIiLKKGYLSFLIPSSLVD 
HLYLLP YENLLTEDETT I SDDVDI ARDVI CLI KCLRLI EES VTV 
DMS VI MEMS CYNLQSPBKAAEQ 1 LEDMITI DVENVMED I CS KLQ 
EIRNP IHAIGLL I REMD YETE VEMEKGFNPAQ PLNIRMNLTQLY 
GSNTAG YI VCRGVHKIAS TRFLICRDLLILQQLLMRLGDAV I WG 
TGQLFQAQQDLLHRTAPLLLS YYL I KWGSE CLATDVPIiDTLESN 
LQHLS VLELTDSGALMANRFVSS PQT I VELFFQE VARKH 1 1 SHL 
FSQPKAPLSOTGLNWPEMITAITSYIiLQIiLWPSNPGCLFLECLM 
GNCQYVQLiQDYIQLIjHPWCQVNVGSCRFMLGRCYLVTGEGQKAL 
ECFCQAASEVGKEEFLDRLIRSEDGEIVSTPRI^YYDKVLRLLD 
VIGLPELVIQLATSAITBASDDW\KSQATLNRTCIFKHHL\DLG 
\HNSQAYGSL * PQI PDSSRQLDCLRQLVWLCERSQLQDLVEFS 
YVNLHNE WGI IESRARAVDLMTHNYYELLYAFHI YRHNYRKAG 
TVMFE YGMRLGRE VRTI^GLEKQGNC YLAALNCLRLI RPEYAWI 
VQPVSGAVYDRPGASPKRUHDGECTAAPTNRQIEIIiELEDLEKB 
CSLARIRLIIjAQHD PSAVAVAGSSSAEEMVTLLVQAGLFDTA IS 
LCQTFKLPLTPVFEGLAFKCIKIX3FGGEAAQAEAWAWLAANQLS 
S VITTKESS ATDEAWRLLS T YLERY KVQNNLYHHCVINKLLSHG 
VPLPNWLINSYKKVDAAELI*RLYLNYDLIiDLTPYQVrRICGC | 


5826 


3 


871 


ksqllrdhsapppkpctsvgamgc*prq/spkeo^rqlkkqknrH 

AAAQRSRQKHTDKADAIJHQQHESLEKDNLAIiRKEIQSI^OAELAW 

WSRTIiHVHERiCPMDCASCSAPGIiI^WDOJ^ 1 

CREQLELFQTPGSCYPAQPLSPGPQPHDSPSLLQCPLPSLSLGP 

AWAEPPVQLSPSPLLFASHTGSSLQGSSSKLSALQPSLTAQTA 

PPQPLBLEHPTRGKLGSSPDNPSSALGLARLQSREHKPALSAAT 

WQGLWDPSPHPIiLAFPLLSSAQVHF 


5827 


194 


2287 


GMGS ENS ALKS YTLRE PPFTLPS GLAVYPAVLQDGKFAS VFVYK ( 
RENEDKVmGWCVP**HIJ<TLRHPCLLRFIiSCTVEAIX3lHLVTE 
RVQPLEVALETLSS AE VCAG I YD XLLALI FLHDRGHLTHNN VCL 
S SVF VSBDGHWKLGGMETVCKVSQATPEFLRSIQS I RDPAS I P P 
EEMSPEFTTLPECHGHARDAFSFGTLVESLLTILNEQVSADVLS 
SFQQTLHSTLLNPIPKWRPALCTLLSHDFFRNDFLEWNFLKSL 
TLKSEEB KTE FFKFLLDRVSCLSEELIAS RLVPLLLNQLVFAEP | 








VAV\KSFLPYLLGPKKDHAWETPCLLSPALFQSRVI PVLLflLF 
EVHEEHVRM VLLSH 1 KAYVGAI>SLREQLKKV\ 1 L \ PQVLLG \ LR 
D\ TS DS I VAI T LHSLAVLVS LLGPEVWGGERTK I FKRTAP\S F 
TK\NTDLSLEGDPFSQPI KFP INGLSDVKNTSEDS ENFPSSSKK 
SEEWPDWSGPE \EPENQTVNI \QI WP\REP\CDDVKSQCTTLDV 
BESSWDDCEPSSLDTKVNPGGGITATKPVTSGEQKP I PAUbSLT 
BESMPWKSSLPQKISLVQRGDDADQIEPPKVSSQERPLKVPSBL 
GIiGEEFTIQVKKKPVKDPEMDWFADMIPEIKPSAAFLILPELRT 
EMVPKKDDVSPVMQFSSKFAAAEITEGEAEGWEEEGELNWEDNN 
W 


5828 


2 




AREGGSLGAVAACGELS YSCDFCPARPHTS WLTRFVKMEFQAW 
MAVGGGSRMTDLTSS I PKPLL P VGNKFL I W YPLNLLERVGFEEV 
I WTTRD VQ KALCAE FKMKMKPDI VCI PDDADMGTADS LRYI YP 
KLKTDVLVLS CDLI TDVALHB WDL FRAYDAS LAMLMR KGQDS I 
EPVPGQKGKKKAVEQRDFIGVDSTGKRLLFMANEADLDEELVIK 
GS ILQKHPRIRFHTGLVDAHLYCLKKY I VDFLMENG\SITS1RS 
BL\ I P YLV/RGKQFS S AS SQQGTRKBKEGGS KGKRGLKS FR IS Y 
SFY*KEANYTGTGAPY\D\ACWI 


5829


260 


1259 


FDGRL I VS CSEDKT I KI WDTTNKQCVNNFSDS VGFANFVDFNPS 
GTCIASAGSDQTVKVWDVRVNKLLQHYQVHSGGVNCISFHPSGN 
YL X TASSDGTLKI LDLLKGRLI YTLQGHTGPVFT VS FSKGGELF 
ASGGADTQVLLWRTNFDELHCKGLTKRNLKRLHFDS PPHLLDI Y 
PRTPHPHEEKVETVEDFFLHLLRLIQSLR*SICRSIiI»PLLWISF 
LLI L PQQQKPWGLCQTR VKRP VD I S *TLP*CHQNVCQQPRKRK 
QKT* VTSPVKVK/ VS I PLAVTDALEHIMEQLNVLTQTVS I LEQR 
LTLTEDKLKDCLENQQKLFSAVQQKfl 


5830 


4496 


3139 


GGKMAAPEERDLTQEQTEKLLQFQDLTGI ESMDQCRKTIiEQHNW 
NIEAAVQDRIiNEQEGVPSVFNPPPSRPLQVWTADHRIYSYWSR 
PQPRGLLGWGYYLIMLPFRFTYYTILDIFRFALRFIRPDPRSRV 
TDPVGD I VS FMHS FEEKYGRAHPVFYQGTYSQALNDAKRELRFI* 
LVYLHGDDHQDSDEFCRNTLCAPEVISLINTRMLFWACSTKKPE 
GYRVSQALRENTYPFLAMIMLKDRRE*PV\ VGRLEGLI \QPDDL 
INQLTF IMDANQT YLVS E RLEREERNQTQVLRQQQDEAYLAS LR 
ADQEKERKKRBERERKRRKKEEVQQQKLAEERRRQNLQEEKERK 
LE CLP PEPS PDDPESVKI IFKLPNDSRVERRFHFSQSLTVIHDF 
IiFSLKESP\EKFQIEA\NFPRR\VLPCIPSEE\WPNPPTLQE\A 
GLSHTE VLFVQ DLTDE 


5831 


71 


2897 


FCSKDKCCLYLPDSINRSKSCrAKPGAHSQDRHAVMDS^RQVKD 
TDDIESPKRS IRDSGYIDCWDSERSDSLSPPRHGRDDS FDSLDS 
FGSRSRQTPS PD WLRGS SDGRGSDS ESDLPHRKL PDVKKDDMS 
ARRTSHGEPKSAVPFNQYLPNKSNQTAYVPAPLRXKKAEREEYR 
KSWSTATSPAGLGKKALQDYGPRT\PVS\DDAESTSMFDMRC3E 
EAAVQPHSRARQEQLQLINNQLREEDDKWQDDLARWKSRKRSVS 
QDLIKI03EEKKKMEKLLAGBDGTSERRKSIKTYREIVQEKERRE 
RELHEAY KNARSQE EAEG I LQQY I ERFTISBAVLERLEMPK I LE 
RSHS TE PNLSS FLNDPKPMK YLRQQSLPPPKFTATVETT I ARAS 
VLDTSMS AGSGS PS KTVTPKAVPMLTPKPYSQPKNS QDVLKTFK 
VDGKVS VNGET VHREEE KERECPTVAPAHSLTKS QMFEGVARVH 
GSPLELKQDNGSIEINIKKPNSVPQELAATTBKTEPNSQBDKND 
GGKSRKGNIELASSEPQHFTTTVTRCSPTVAFVEFPSSPQLKND 
VSEEKDQKKPENEMSGKVELVLSQKWKPKSPEPEATLTFPFLD 
KMPEANQLHLPNLNSQVDSPSSEKSPVTTPFKFWAWDPEEERRR 
QEKWQQEQERLLQERYQ\KEQDK\LKEE\WEKAQKEVEEEERRY 
YEEE P * 1 1 \ EDP WPFTVS SSSADQLSTS SSMTEGS GTMNKI DL 
GNOQDE KQDRR WK KS FQGDDS DLLLKTRESDRLEEKGSLTEGAI* 
AHSGNP VS KGVHEDHQLDTEAGAPHCGTNPQLAQDPSQNQQTSN 
PTHSSED VKPKTL PLDKS INHQ I ES PS ERRKS I SGKKLCSS CGL 
PI^KGAAMIIErrLNLYFHIQCFRCG\lCKGQLGDAVSGTDVRIR 
NGLLNCNDCYMRSRSAGQPTTL 


5832 


2454 


829 


PGRRFRHGSCAFQKQCI MLH I CQ YFLQGECKFGTS C KRSED FSN " 
SENI^KLEKLGMSSDLVSRLPTIYRNAHDIKNKSSAPSRVPPLF
VPQGTSERKDS SGS VS PNTLS QE EGDQ I CLYH IRKS CS FQDKCH 
RVH FHLP X KWQFLDRGKWEDLDNMELI EE AYCNPKI BR I LCSE S 
ASTFHSHGUNFNAMTYGATQARRLSTASSVTKPPHFILTTDWIW 
YWSDE FGS WQE YGRQGTVHPVTTVSSSDVEKAYLAY / WYTG V* R 
PGSHLEVPGRKAQLRVRFQSLRSEKPGLWHN*KGLPQTQIR\AP 
QDVTTMQTCNTKFPGPKSIPDYWD5SALPDPGFQKITLSSSSEE 
YQKVWNLFNRTLP FYFVQKI ERVQNLALWEVYOWQKGQMQKQNG 
GKAVDERQLFHGTSAI FVDAI CQQNFDWRVCGVHGTS YGKGS YF 
ARDAAYSHH YS KS DTQTHTMFLARVLVGE FVRGNAS FVR P PAKE 
GWSNAFYDSCVNSVS3>PSIFVIFEKHQVYPBYVIQYTTSSKPSV 
TPS I LLALGS LFSSRQ 


5833 


170 


3289 


SILCLLSPCWQFGKPWSILSSRSRHSPCTKKGWEGMRKHLHT 
RQGHK* VHVE I SKALW VYRDDYFI RHS IS VS AVI VRAWITHKYR 
GRDWNVKWEENLLHAVAKNYTLLQTI PPFERPFKDHQVCLEWNM 
GYIWMLRANRIPQCPLENDVVALLGFPYASSGKNTGIVKKFPRF 
RNRELEATRRQRMDYPVFTVSLWLYLLHYCKANLCGILYFVDSN 
EM YGTP S VFLTBEGYLHI QMHLVKGEDLAVKTKF 1 1 P LKEW FRL 
DI S FNGGQ I WTTS IGQDLKS YHNQTISFREDFH YNDTAG YFI I 
GGSRYVAG I EGFFGPLKYYRLRSLHPAQI FNPLLEKQLAEQ I KL 
Y YERCAE VQE I VS VYASAAKHGGERQEACHLHWS YLDLQRR YGR 
PSMCRAFPWEKELKDKHPSLFQAIiLEMDLLTVPRNQNESVSEIG 
GKI FE KAVKRLS S IDGLHQI S S I VP FLTDSS CCG YHKAS YYLAV 
FYETGI*NWRDQLQGMLYSLVGGQGSERLSSMNIX3YKHYQGIDN 
YPIiDWELSYAYYSNIATKTPLDQHTIjQGDQAYVETI RT iKDDE I I* 
KVQTK^DGDVFMWI^EATRGNAAAQQRIAQ^u^FWGQQGVAi<K^P 
EAAIEWYAKGALETEDPALIYDYAIVLFKGQGVKKNRRLALELM 
KKAASKGLEQAVNGLGWYYHKFKKNYA\KAAKYWLKA\EE\MGN 
PIXAS YNIX3 VIjHIiDG I FPG VPGRNQTLAGEYFHKAAQGGHMEGTI* 
WCSLYYITGNIiETFPRDPEKAVVWAKHVAEKNGYLGHVIRKGLN 
AYLEGS WHEALL YYVLAAETGIEVSQT!NIiAHI CEERPDIiARRYL 
GVNCVWRY YN FS V FQ I DAPS FAYI>KMGDLYYYGHQNQSQDLEIiS 
VQMYAQAALDGDSQGFFNLALLI EEGTI IPHHILDFLEI DSTUI 
SNNISILQELYERCWSHSNEBSFSPCSLAWLYIiHLRLLWGAILH 
SAL I YFLGT FLLS I L IAWT VQ YFQSVSASDPPPRPSQAS PDTAT 
STASPAVTPAADASDQDQPTVTNNPEPRG 


5834 


17 


4020 


RFRRGGGRVFPGAFPASPSDSLGQGNSQGPPRTPKPPRT/'QECG" 

SAAPGPIPGQSSS*VPUII J BQIQQKADCPLSLELAI»KPRMAAQV 

TLEDALSNVDLLEELPLPDQQPCIEPPPSSLLYQPNFNTNFEDR 

NAFVTG I ARY I EQATVHS SMNEMLEEGQEYAVMLYTWRSCSRA I 

PQVKCWEQPNRVE I YEKTVEVLEPEVTKIMNFMYFQRNAIERFC 

GEVRRLCHAERRKDFVSEAYLITLGI07INMFAVLDELKNMKCSV 

KNDHSAYKRAAQF1*RKMADPQS IQESQNLSMFLANHNKI TQSLQ 

WLBVI SGYEBLLADIVNLCVDYYENRMYLTPSEKHMLLKVMGF 

GLYLMDGS VSNI YKLDAKKR INLS KID KYFKQLQ WPLPGDMQ I 

EIJ^YIKTSAHYEENKSRWTCTSSGSSPQYNICEQMIQIREDHM 

RFISELARYSNSEVVTGSGRQEAQKTDAEYRKLFDIiALQGIiQLIj 

SQWSAHVMEVYSWKLVHPTDKYSNKDCPDSAEEYERATRYNYTS 

EEKFALVEVIAMIKGLQVLMGRMESVFNHAIRHTVYAALQDFSQ 

VTLMEPLRQAI KKKKT^IQSVIjQAIRKTVCIDWETGHEPFNDPAIj 

RGE KD P KS G* D I KVPRRAVG PS STQIiYMVRTMLESL IADKSGS K 

KTLRS S LEGPT XLD I EKFHRES FFYTHL INFSETLQQCCDLSQIj 

WFREFFLELTMGRRIQFPIEMSMPWILTDHILETKEASMMEYVIi 

YSLDLYNDSAHYALTRFNKQFLYDEIEA2VNLCFDQFVYKLADQ 

IFAYYKVMAGSLLLDKRLRSECKNQGATIHLPPSNRYETLLKQR 

HVQLLGRS IDLNRLI TQRVSAAMYKSLEIiAIGRFESEDLTS IVE 

LDGLLE INRP4THKLLSRYLTLDGFDAMFREANHNVSAP YGR ITL 

HVFWELNYDFLPNYCYNGSTNRFVRTVLPFSQEFQRDKQPNAQP 

QYLHGSKALNIiAYSSIYGSYRNFVGPPHFQVICRLLGYQGIAVV 

^ELLKWKSLLO^TILQYVKTLMEVMPKICRLPRHEYGSPGII* 

EFFHHQLKDIVEYAELKTVCFQNljREVCINAILFCLLIEQSIiSLE 

EVCDLLHAAPFQNILPRVHVKBGERLDAKMKRIiESKYAPIjHLVP

lJbr)lgtpqq1aiaregdlltkerl(!:cgijSmpeviltrirsfld — 

DP I WRG PLPSNGVMHVDE CVEFHRLWSAMQFVYC I PV3THE FTV 
EQCFGDGLHWAGCMIIVLLGQQRRPAVLDFCYHLLKVQKHDGKD 
EI IKNVPLKKMVERIRKFQI LNDE I ITI LDKYLKSGDGEGTPVE 
HVRCFQPPIHQSLASS 


5835 


4209 


1904 


SGNI RMAQGSHQ I D FQ VLHDLRQK FPE VPEVWS RCMLQKNNN L 
DACCAVLSQBSTRYLYGEGDLNFSDDSGISGLRNHMTSLNLDLQ 
SaNIYHHGREGSRMNGSRTLTHSISDGQLQGGQSNSELFQQEPQ 
TAPAQVPQGFNVFGMSSSSGASNSAPHLGFHLGSKGTSSLSQQT 
PRFNP I MVTLAPNI QTGRNTPTSLHI HGVP PP VLNS PQGNS I Y I 
RPYITTPGGTTRQTQQHSGWVSQFNPMNPQQVYQPSQPGPWTTC 
PASNPLS HTSSQQPNQQGHQTSHVYMP I S S PTTSQP PTIHS SGS 
SQSS AHSQYNI ON I S TGPRKNOI E I KLE PPORNN <? <? kt ,n c; c r Dt> 
TSSTSS S VN5QTLNRNQPT V YI AAS P PNTDBLMS RSQPKV Y I S A 
NAATGDEQ VMRNQPTLF I S TNS GAS AAS RNMSGQ VSMGPAF IHH 
H PPKSRAIGNNSATS PRVWTQPNT\ E YTFKITVS PNKP PAVS P 
GWSPTFELTNLLNHPDHYVETENIHHLTDPTLAHVDRISETRK 
LSMGSDDAAYTQDI *RISNS WLGMVAHACNSSALGGQDGRI I *A 
QEFETSWGNIWRLRI#YRRF*NYAGMVAHTCSPSYSVD*ALIjVHQ 
KARMERLQRELB IQKKKLDKLKSEVNEMENNLTRRRLKRSNS I S 
QI PSLEEMQQLRSCNRQLQIDIDCLTKE idlfqargphfnpsai 
HNF YDNIGFVGP VP P KPKDQRS I IKTPKTQDTEDDEGAQWNCTA 
CTFLNHPALIRCEQCEMPRHF 


5836 


361 


2303 


FHITMCGICgsVNFSAEHF^S^DLkEDLLYNLKQRGPNSSKQLLK " 
SDVNYQCLFSAHVLHJURGVLTTQPVEDERGNVFLWNGEIFSGIK 
VSAEENDTQILFNYLSSCKNESEILSLFSEVQGPWSFIYYQASS 
H YLWFGRDFFGRRSjjLWHFSNLGKS FCLSS VGTQTSGLANQWQE 

VPAS\DFSELILSLLSFPDAI»FYNrTT^5WTTTTJTPTT.T VtfMT T»* 

VXFQQTYQHLYQR* QMKPNCI LKNLLFL* I *CCHKLHWRLI AVI 
FPMCHLQER YFKS FLLMYT * KEVIQQFI DVLSVAVKKR VLCL PR 
DENLTANEVLKTCDRKANVAILFSGGIDSMVIATLADRHIPLDE 
P IDI*LNVAFIAEEKTMPTTFNREGNKQKNKCEIPS EEFS KDVAA 
AAADS PNKH VSVPDR I TGRAGLKEL/QAVS PSRI WNFVE INVSME 
EIiQKIJIRTRICHLIRPLDTVLDDSIGCAVWFASRGIGWLVAQEG 
VKS YQSNAKWLTG IGADEQLAGYSRHRVRFQSHGLEGIiNKE IM 
MEI^SRI SS RNLGRDDRVIGDHGKE ARFP FLDENWS FLNSLP I W 
EKANLTLPRGIGEKLLLRLAAVELGLTASALLPKRAMQFGSRIA 
KMBKINEKASDKCGRLQIMSLENLSIBKETKL 


5837 


4792 


903 


NGNAVAQAP VTNCC YLATGSKDO^TIR I WS CSRGRGVM t LKLPFL 
KRRGGG I DPTVKERLWLTIiHWPSNQPTQLVSS CFGGELLQW0LT 
QSWRRKYTIiFSASSEGQNHSRIVFNLCPLQTEDDKQLLLSTSMD 
RDVKCWDI ATLECS WTTiPSLGGFAYSLAFS S VDIGS IAIG VGDG 
M I R VWNTLS IKNNYDVKNFWQGVKSKVTALCWHP^KEGCIiAFGT 
DDGKVGLYDTYSNKP PQISS T YHKKT VYTLAWGPP VP PMS LGGE 
GDRPSLALYS CGGEG I Vl^JHNPWKLS GEAFDIMKLI RDTNS I KY 
KLPVHTEISWKADGKIMALGNEDGSIEIFQMPNLKLICTIQQH 
HKL VNTIS WHHE \HGS PAQKLS YL \MPSGSQQCS PFTCHNLKNC 
P* KAAPES PSDPLQSPYRTPPQGHTAQDYPVWAWEPHIH* WEGL 
VFCFP IDG YS PGCWD \AFPGKEAPVAI FRG\HQGRLLCVAWSPIi 
DPDCIYSG\AJDDFCVHKWLTSMQDHSRPPQGKKSIELEKKRLSQ 
PKAKPKKKKKPTLRTPVKLESIDGNEEESMKEMSGPVENGVSDQ 
EGEEQARE PE LPCGLAPAVS REP VI CTP VSS GFEKS KVTINNKV 
ILLKKEPPKEKPETLIKKRKARSLIjPLSTSLDHRSKEELHQDCL 
VLATAKHSRELNEDVSADVEERFHLGLFTDRATLYRMIDIEGKG 
HLENGHPELFHOJLMLWKGDLKGVLQTAAERGELTDNLVAMAPAA 
GYHVWLWAVBAFAKQLCFQDQYVKAASHLLS IHKVYBAVELLKS 
NHFYREAIAIAKARLRPSDPVLiODLYLSWGTVLERDGHYAVAAK 
CYLG ATCA YDAAKV LAKKGDAAS LRTAAE LAAI VGE DELS AS LA 
LRCAQELLLANNWVGAQSALQIjHESLQGQRLVFa^ELLSRHLE 
EKQLSEGKSSSSYHTWNTGTBGPFVERVTAVWKSIFSLDTPEQY 
QEA^KLQNIKYPSATNNTPAKQIiLLHICHDLTLAVLSQQMASW
DEAVQALXRAVVRSYDSGSFT IMQEVYSAFLPDGCDHLRDKLGD 
HQSPATPAFKSLEAFFLYGRLYEFWWSLSRPCPNSSVWVRAGHR 
TLSVEPSQQLDTASTEETDPETSQPEPNRPSELDLRLTEEGERM 
LS T FKE LFS E KHASLQNSQRTVAEVQETLAEM I RQHQ KS QLCKS 
TANGPDKNBPEVEAEQPLCSSQSQCKBEKNEPLSLPELTKRDTB 
ANQRMAKFPESIKAWPFPDVLECCLVIxLLIRSHFPGCLAQEMQQ 
QAQELLQKYGNTKTYRRHCQTFCM 


5838 


110 


98


KTMPHLLVTFRDVAIDFSQEEWECLDPAQRDLYRDVMLENYSNL 
ISLDLESSCVTKKLSPEKE1YEMES\PSGRIWGNVSTITFQYNG 
LGDNME CKGNLEGQVS KS EGLYMCVK I TCE E KATESHSTS S TFH 
RI I /HYGjGKl VKCKE CRQGFS YLSCLIQHEENHNI* KCSEVNKH 
RNTFS KKPS Y I * HQ\ KFRLGEKP YECMECGKAFGRTSDLIQHQK 
IHTNEKPYQCNACGKAFIRGSQLTEHQRVHTGEKPYDCKKCGKA 
FS YCS QYTLHQR IK5GEKP YECKDCG KAF1LG S QLTYHQRIHSG 
EKPYECKECGKAFILGSHLTYHQRVHTGEKPYICKECGKAFLCA 
SQLNEHQRIHTGEKPYECKECGKTFFRGSQLTYHLRVHSGERPY 
KCKECGKAFISNSNLIQHQRIHTGE KP YKCKECGKAF I CGKQLS 
EHQRIHTGEKPFECKECXSKAFIRVAYIjTQHBKIHGEKHYECKEC 
GXTF VRATQLTYHQR I HTGEKP YKC KBCDKAF/HLWLT I L5EHQ 
RIHRGEKPYECKQCGR/LFIRGSHL/NEHLRTHTGEKPYECKEC 

grafsrgsehtlhqrihtgekpytcvqcgkdfrcpsqltqhtrl 
hn*eysshkicmhsialasldfahlqeknpen 


5839 


1 


2425 


grpfprppralprlplrgrrqdgrwtvdfeeclkd\sprfraal 
eevegdvaelelkl\dklvkix:ia\midtgkafcvankqfmngi 
rd\laqns\nnda\wetkfapsfldslqeminfhtil/i,*pns 
ein*ghsfqnfvkedi^kfkdakkqfensq*krkkialvkkapv 
psrpaslel*kppniltatrkcfrhialdyvlqinvdqskrrse 
ilkswlsfmyahlaffhqgydlfsblgpywkdlgaqldrlvgda 
akekremeckhstiqqkdfsrddsklkynvdaangivmegylfk 
rasnafktwnrrwfs iqnnqwyqkkfkdnptwvedlrlctvk 

HCED I ERRFCFE WS PTKS CMLQADSE PCLRQA W I KAVQTSI \AT 

ayrbkddesekldkksspstgsldsgneskekllkgesalqrvq 
cipgnasccdcgladprwasinlgitlciecsgihrslgvhfsk 
vr5ltldtwepellki^cei^1tovinrvyeanvekmgikkpqpg 
qrqekeayirakyverkfvdkifl*slspp\bqqkk\fvsksse 

EKRLSISKFX3P\GDQVRASAQSSVRSNDSGIQQSSDDGRBSLPS 
TVSANSLYEPEGERQDSSMFLDSKHLNPGLQLYRASYEKNLPXM 
AE1ALAHG ADVNWANS EENKAT PL 1 QAVLGGS LVTCE FLLQNGAN 
VNQRDVQGRG PLHHATVLGHTGQVCL FLKRG ANQHATDEEGKD P 
LS I AVEAANADI VTLLRLARMNEEHRESEGLYGQPGDETYQDI F 
RDFSQMASNNPEKLNRFQQDSQKF 


5840 


698 


3610 


KHLHL PRQHLTTLWQ I S S PRWRS PQRAFMSALSKTQTQSAPALQ 
GLSSLLQS VTGNPVPASEAASQSTSASPANTT VYTI KGRNLPSS 
AQPFIPKSFNYSPNSSTSEVSSTSASKASIGQSPGLPSTAFKLP 
SNTKGFTATHNTSPAAPPTEVTICQSSEVSKPKL\ESESTSPSL 
\2MKIHNFLKGNPGFSVA*NLKHPNPAGSLGSSAPSESHPSDFQ 
RGPTS TS I DNI DGTP VR0ERS GTPTQDEMMDKPT SS S VDTMSLL 
SKIISPGSSTPSSTRSPPPGRDESYPRELSNSVSTYRPFGLGSE 
SPYKQPSDGMERPSSLKDSSQBKFYPDTSFQEDEDYRDFEYSGP 
P PS AMMNLQKKPAKS I LKS S KLSDTTE YQ P I L S S YSHRAQE FG V 
KS AF P PS VRALLDSSENCDRLSS SPGLFGAFSVRGNE PGSDRS P 
S PSKNDS F FTPDS NHNS LS QS TTGHLS L PQKQYPDS PHP VPHRS 
LFSPQNTIJUU>TGH?PTSGVEKVLASTISTTSTIEFKNMLKNAS 
RKPSDDKHFGQAPS KGTPS DGVSLSNLTQPSLTATDQ QQQBEH Y 
RIETRVSSSCLDLPDSTEBKGAPIETLGYHSASNRRMSGEPIQT 
VESIRVPGKGNRGHGREASRVGWFDLSTSGSSFDNGPSSASELA 
SIiGGGGSGGLTGFKTAPYKERAPQFQESVGSFRSNSFNSTFEHH 
LiPPSPLEHGTPFQREPVGPSSAPPVPPKDHGGIFSRDAPTHLPS 
VDLSNPFTKEAALAHAAPP PP PGEHSG I PFPTPP PP P P P GEHS S 
SGGSGVPFSTPPPPPPPVDHSGWPPPAPPLAEHGVAGAVAVFP 
KDHSSLLQGTLAEHFGVLPGPRDHGGPTQRDLNGPGLSRVRESL
TLPSHSLEHLGPPkGCGGGGGSN&SSGP PLGPSHRDTI SRSGI I 

LRSPRPDFRPREPFLSRDPFHSLKRPRPPFARGPPFFAPKRPFF 
PPRY 


5841 


1908 


762 


GLRLFLVLT VW PMMKPS WL3RTE FS KRIiLCRTL WCQSGW S SRS Y 
TRSMLKMTTS INRRSRTSTKSTRTSARPGLTATVS IGLSDS PTW 
RHCWMTARSCSGEKGGHWAPRQVGVYLLPGRVGCVSSRVSPSFP 
GDGLDSGLARRGSAVSALASGLVEEPMLGPPFHPTPRFKAVSAK 
SKEDLVSO/3FTEFTIEDFHNTFMDLIEQVEKQTSVADLLASFND 
QSTSD YLW YLRLLTSG YLQRES KFFEHFI EGGRT VKE FCQ \QE 
\VEPMCKESDHIHIIALAQGLQRVHPGWBYMGPRPRAATTNPHI 
FP * G LPS P KVYLLYRPG \HYD XL YKIGLGS S PLGCPGCPLLARA 
LGHCYRG FS WVKWS YFTPFFLSHDPPPMFY 


5842


307 


1918 


QEPTADFKLRSTCGCGREMTCPDKPGQLINWFICdLcVPR^Kir* 
WS SRRPRTRRNLLLGTACAI YLGFLVSQVGRASLQHGQAAE KGP 
HRSRDTAEPSFPEIPLDGTLAPPESQGNGSTLQPNWYITLRSK 
RS KPANTRGTVXPKRRKKHAVASAAPGQEALVG PSLQ PQEA\ EG 
KLML * HLGTLREQTWLRLESDPGGWCGVRE/ WRAGGPDFLQPS S 
RESN IRI YSESAPS WLSKDDIRRMRLLADSAVAGLRPVS SRSGA 
RLLVLEGGAPGAVLRCGPS PCGLLKQPLDMS E VFAFHLDR I LGL 
NRTLPSVSRKAEFIQDGRPCPIILWDASLSSASNDTHSSVKLTW 
GTYQQLLKQ. KCWQNGR VPKPES G CTE I HHHB WS KMAL FD FLLQ I 
YNR LDTNCCGFRPRKEDACVQNGLRPKCD0QGSAALAHI IQRKH 
DPRHLVFIDNKGFFDRSEDNLNFKLLEGIKEFPASAVYVLKSQH 
LRQ KLLQSLFI<DKG YWESQGGRQGI EKLIDVIEHRAKILI TY IN 
AHGVKVLPMNE 


5843 


$00 


1453 


GTARLVTCWVLHGQ*VKKPAWEPGWWL*Q*RCRPKGWGLGAGM " 
R3SRMSQPPQCLRRAQSSCCHFMVKLLDDGTFMI PGEKVAHTSL 
DALVTFHQQKP I EPRRBLLTQPCRQKDPANVD YEDLFLYSNAVA 
EEAACPVSAPEEASPKPVLCHOSXERKPSAEM/RQNNHQGSHFL 
LPPKIPSWRDPPETLEEPQNAPRERPEGPAAAKKPPRHCBLWT 
LGCPEIHGDLRPWDRKRQPRSLRGSHLGGQRLHGSLCGHISQKP 
LTAPGTKRQKG PHQEGREVGQLH*GD PRGQELAPNGS ES P I LPG 
VQARAPGLGRA 


5844


202 


2471 


FDSAVLSS INVMAVLPGPLQLLGVLLTI SLSS IRLIQAGAYYG I 
KPLPPQI PPQMPPQI PQYQPLGQQVPHMPLAKDGLAMGKEMPHL 
QYGKEYPHLPQYMKEIQPAPRMGKEAVPKKGKE I PLASLRGEQG 
PRGEPGPRGPPGPPGLPGHGIPGIKGKPGPQGYPGVGKPGMPGM 
PGKPGAMGMPGAKGEIGQKGEIGPMGIP * PQGPPGPHGLPG IGK 
PGGPGLPGQPGPKGDRGPKGLPGPQGLRGPKGDKGFGMPGAPGV 
KGPPGMHGPPGPVGLPGVGKPGVTGFPGP\QGPLGK\PGAPGEP 
GPQGPIGVPGVQGPPGIPGIGKPGQDG\IPGQPGFPGGKGEQGL 
PG LPG P PGLPG I GKPG FPG PKGDRGMGGVPGALGPRGEKGP IGA 
PGIGGPPGEPGLPGIPGPMGPPGAIGFPGPKGEGGIVGPQGPPG 
PKGEPGLQGFPGKPGFLGEVG PPGMRG FPG PI GPKGEHGQKGVP 
GLPGVPGLLGPKGEPGIPGDQGLQGPPGIPGIGGPSGPIGPPGI 
PGPKGEPGLPGPPGFPGIGKPGVAGLHGPPGKPGALGPQGQPGL 
PGPPGPPGPPGPPAVMPPTPPPQGEYLPDMGLGIDGVKPPHAYG 
AKKG KNGGPA YEMPAFTAELTAPF PP VGAP VKFNKLLYNGRQN Y 
N PQTG I FTCEVPGVY YFAYHVHCKGGNVWVALFKNNE PVM YTYD 

E YKKGFLDQASGSAVLLLRPGDRVFLQM PS EQAAGL YAGQYVHS 
SFSGYLLYPM 


5845


215 


2061 


HASNKSASLQDKMANPKEKTAMCLVNELARFNRVQPQYKLLNER 
GPAHSKMFSVQLSLGEQTWESEGSSIKKAQQAVGNKALTESTLP 
KPI*KPPKSNVNNNPGCITFTVELNGLAMKRG\KPAIHRPLDPK 
PFPNNRANYNFQVMYNQRYHCPIPKIFYVQLTVGNNEFFGEGKT 
RQAARHNAAM KALQALQNEPI PERS PQNGES GKDMDDDKDANKS 
E ISL VFE I ALKRNMFVSFEVI KES GPPHMKS FVTRVS VGEFS AE 
GEGNS KKLS KKRAATT VLQELKKLP PLPWEKPK\HFFKKRPKT 
IVKAGPEYGQGMNPISRLAQIQQAKKEKEPDYVLLSERGMPRRR 
E FVMQ VKVGNE VATGTGPNKK1AKKNAAEAMLLQLG YKASTN^ 
E3QLEKTGENKGWSGPKPGFPEPTNNTPKGILHLSPDVYQEMEAS
KHKV1SGTTLGYLSPKDMNQPS SSFFSISPTSNSSATIARELLM 
NGTSSTAEAIGLKGSSPTPPCSPVQPSKQI.EYLARIQGFQVHYC 
DRQSGKECVTCI1TI1APVQMTFHAIGSS I EASHDOV* YATAILLC 
YGPARKWKAIKMRAMCAHAALLSLIHYLIiAPSARLBKSKI^ALG 



FSKUIKKTFIxoiSGVTNSGKTTLAKNLQKHtPNCSVISQDDFF 
KPES B I ETDKNG PLQ YDVLEALNMBKMMS AI S CWMBS ARHS WS 
TDQ3SAEEIPILIIEGFLLFNYKPLDTIWNRSYFLTIPYSECKR 
RRSTRVYQPPDSPGYFDGH^PMYLKYRQEMQDITWEVVYLDGT 

KSEEDLFLQVYEDLIQEUUCQKCLQVTA*RRNTTNPS/CK*IRK 
LQG VI 



5848



22 



APEMEDLS SPDSTLLQGGHNLLS S AS FQ US VTFKDV I VDFTQE E" 
WKQLDPGQRDLFRDVTLENYTHLVS IGLQVSKPDVISQLEQGTE 
PWIMBPS I PVGTCADWETRLENS VS APE PDI S E EELS PE V I VEK 
EKRDDSWSSNLLESWEYEGSLERQQANQQTLPKEIKVTEKTIPS 
WE KGPVNNE FGKS VNVSSNLVTQE PS P EETS TKRS I KQNSNPVK 
KE KSCKCNECG KAFS YCS ALI RHQRTHTGEKP YKCN* /CVEKAF 
S RS ENL INHQR IHTGDKP YKCDQCGKG FIBGPS LTQHQR IHTGE 
KPYKCDECGKAFSQRTHLVQHQRIHTGEKPYTCNECGKAFSQRG 
HFMEHQKIHTGEKPFKCDECDKTFTRSTHIiTQHQKIHTGEKTYK 
CMECGKAFNGPSTFIRHHMIHTGEKPYECNECGKAFSQHSNLTQ 
HQ KTHTGEKP YDCAECGKS FS YWSSLAQHLKI HTGEKP YKCNEC 
GKAFSYCSSLTQHRRIHTREKPFECSECGKAFSYLSNLNQHQKT 
HTQEKAYECKECGKAFIRS S S LAKHER IHTGE KPYQCHECGKTF 
* SYGSSLIQHRKIHTGERPYKCNECGRAFNQNIHLTQHKRIHTGA 
KPYECA3CGKAFRHCSSLAQHQKTHTEEKPYQCNKCBKTFSQSS 
HLTQHQRIHTGEKPYKCNECDKAFSRSTHLTQHQRIHTGBKPYK 

CNECGK\TFSQSTYLIQHQRIHSGEKPFGCNDCGKSFRYRSAZjN 
KHQRLHPGX 



1835 



AAPRRLLRUUDGDRTPRFPLPALLRPGPPAKAAPERRKMPAVSK 

GDGMRGIAVFISDIRNCKSKEAEIKRINKELANIRSKFKGDKAL 

DGYS KKKYVCKLL FIFLLGHDIDFGHMEA VNLLSSNR YTEKQIG 

YLFISVLVNSNSELIRLINNAIKNDLASRMPTFMGLALHCIASV 

GSREMAEAFAGEIPKVLVAGDTMDSVKQSAALCLLRLYRTSPDL 

VP^DVTTSRVVHLl^QHLGWTAATSLITTLAQKNPEEFKTSV 

SLAVSRI,S\RIVTSASTDLQDYTY*FCPGFIiGLSVKLLRLLQCY 

PPPDPAVRGRLTECLETILNKAQEPPKSKKVQHSNAKNAVLFEA 

ISLIIHHDSEPNLLVRACNQIiGQFLQHRETNLRYLALESMCTLA 

S S B FS HEAVKTH I ETVINALKTER DVS VRQRAVDLLYAMCDRSN 

APQIVAEMLS YLETADYS IREE I VLKVAI LAEKYAVDYTW\ YVD 

TILNLIRIAGDYVSEEVWYRVIQIVINRDDVQGYAAKTVFEALQ 

APACHENLVKVGGYILGEPGNLIAGDPRSSPLIQFHLLHSKFHL 

CSVPTRALI^TYIKFVNLFPEVKPTIQDVLRSDSQLRNADVBL 

QQRAVE YLRLSTVASTDI tATVLEEMP P FPERES S I LAKLKKKK 

GPSTVTDLEDTKRDRS VDVNGGPEPAPAS TSAVS TPS PSADLLG 

LGAAPPAPAGPPPSSGGSGLLVDVFSDSASWAPLAPGSEDNFA 

RF VCKNNGVLFENQLLQIGLKS E FRONIjGRMF I FYGNKTSTQFL 

NFTPTLICSDDLQPNLNrOTKPVDPTVEGGAQVQQWNlECVSD 

FTEAPVLNIQFRYGGTFQNVSVQLPITLNKFFQPTEMASQDFFQ 

RWKQLSNPQQEVQN I FKAKHPMDTE VTKAKI IGFGSALLESVDP 

NPANFVGAGIIHTKTTQIGCLLRLEPNLQAQMYRLTLRTSKEAV 
SQRLCELLSAQF 



KRREIKETVFHHVAQAGLELL SSSNPPSSASRSAGITGMRHQVd 
P*DPCMSLSPPCFTEEDRFSLEALQTIHKQMDDDKDGGIEVEES 
DEFIREDMKYKDATNKHSHLHRBDKHITIEDLWKRWKTSEVHNW 
TLEDTLQWLIEFVELPQYEKNFRDNNVKGTTLPRrAVHEPSFMI 
S QLKI SDRS HRQKLQLKALD WLFGPLTRP PHNWMKDF I LTVS I 
VI GVGGCWFAYTQNKTSKEHVAKMWKDLES IiQTAEQSLMDLQER 
LBKAQEENRNVAVEKQNL*RKMMDEINYAKEEACRLRELREGAE 
CELSRRQYAEQEIjEQVRMALKKABKEFELRSSWSVPDALQKWLa 
LTHEV E VQ Y YN I KRQNAEMQLA IAKD EAEKI KKKRS TVFG TLHV
AHSSSLDEVDHKHjEAKKAIiSELTTCLRERLFRWQQI^klCXSFQ 
IAHNSGLPSLTSSLYSDHSWVVMPRVSIPPYPIAGGVDDLDBDT 
PPIVSQFPGTMAKPPGSLARSSSLCRSRRSIVPSSPQPQRAQLA 
PHAPHPSHPRHPHHPQHTPHSLPSPDPDILSVSSCPALYRNBEE 
EEAIYFSAEKQWEVPDTASECDSLNSSIGRKQSPP/SKPRDIPN 
IIS/DERYQEMRCP*RIPSGGIL 


5850 


3 


1895 


KAVLNF SASGS VISIjTGSNPMHDASMWHLKKNGI IVYIjDVPLLN 
LI CRLKLMKTDRI VGQNSGTSMKDLLKFRRQYYKKWYDARVFCE 
SGAS PEEVADKVLNAI KR YQDVDS ETFI S TRHVWPBDCEQKVSA 
EFFIEAVIEGIASDGGLFVPAKEFPKLSCGEMKSLVGATYVERA 
Q I LLERCIH PADI PAARLG EMI ETAYGENFACS KIAPVRHLSGN 
QFILELFHGPTGSFICDIiSLQLMPHIFAQClPPSCNYMILVATSG 
DTGSAVLNGFSRLNKNDKQRIAWAFFPENGVSDFQKAQI IGSQ 
RENGWAVGVESDFDFCQTAI KR I FNDSDFTGFLTVE YGTILSSA 
KS INWGRLLPQVVYHASAYLDLVSQGFI S FGS P VDVC I PTGNFG 
K 1 LAAVYA KMMGI P I RKFI CASNQNHVWTDF I KTG\HYDLRGKE 
K*AQTFFTVQ* IFLPNLSNLERHLHLMANKDGQLMTELFNRLES 
QHHFQIEKALVEKLCXJDFVADWCSEGECIAAINSTYNTSGYILD 
PHTAVAKWADRVQDKTCP VI I SS TAH YS KFAPAIMQALKIKE I 
I^TS SS QLYLLGS YIlALPPIiHEALLERTKOQEKME YQ VCAADMN 
VtKSHVEQLVQNQFI 


5851 


3120 


1802 


RCYLQFLAIiLLTSTSARAAAAIAAAEBPAGSPSVMTRAGDHNRQ 
RGCCGSLADYLTSAKFLLYLGHSLSTWGDRMWHFAVSVFLVELY 
GNS LLLTAVYGLWAGS VLVLGAI IGDWVDKNARLKVAQTSLW 
QNVSVILCGI ILMMVFLHKHELLTMYHGWVLTSCYILI ITIANI 
ANLASTATAITIQRDWIVWAGEDRSKLANMNATI RRIDQLTNI 
IJ\PMAVGQIMTFGSPVIGCGFISGWNLVSMCVEYVLLWKVYQKT 
PALAVKAGL KEEETELKQLNLHKDTEPXPLEGTHLMGVKDSNIH 
ELEHEQEPTCASQMAEPFRTFRDGWVS YYNQPVF/ LGWHGSCFP 
LYDCPGL*LHHHRVRLHSGTEWFHPQYFDGS ISYNWNNGNCS FY 
LATSKMWFGSDRSDLRIGTAFLFDLVCDLCIHAWKPPGLVRFSF 


5852 


1 


422


KriTPSSIiCPLRQLPE^VSGOPLTDPLIStCRSHKCRGKGWG"" 
SSS YPSLPALLRARSAPGHCTHRSCGPEWRIDSI SRLEMQGARR 
SGWAQAQPT I LLLV PRLRKSLP5 1 WG /S LMG F FI TSGPG / WFRQ 
YYFFI SGRH* VLFTESDFY YVAMDPGGHGLSSHYS PGVPYYLQT 
FVS BIRR WAG KXQSVYFRRCGGCS RAP P LI TGGGVGSRKQRWP 
ESGAWAIiAPGLPAIHGRSWES 


5853 


223 


1346 


RLLGLSRVKGLHGPAASAWISDPETRGDPGGPWGMWRG6DLRPR 
PVSLTGLTLVCK*AAQGPQV\HS VKLCFGLGG\PCLL\FPI FRP 
LLLKPRRPRLH PGTRG VAVEPHALR WHVAHG E E AG I RAAGPGH 
GGVEI PQG/VGSLGARRGLRPSRPSSRHRKRVPAPPPGRPLATP 
HRRR FP PDPALTCPGLGQDQGPREQQKQGSGRHDTILGDWGES E 
SRWVRGNFRTGTAATLIGFSRNPTLNGSENWGSIjVSIQEEGPDT 
GWEREKRNPAEMGNPQRWAS P I HTPPLGPE I LRAMPEALRAM PE 
ALGLRPDPATSVPSALS/QTF/PESWPRSCIiRNQGETLGMGPVP 
LSSIjCITESPSQNWTPCLLLLTCPRGLF 


5854 


86 


938 


kgrntapekkgAAlnnrenass*ngy/srWKqdirrienhiiqe 
lxhlcamikrvlleri^ntrklreltegrtldwpqnritevsak 

RQIVTEYRE KGKRN + EE KKRDLEGRSRR YNLCI IG I PETEDRAS 
GABTIKDIiLE/ENFPELKNELDLQMEKAHR I PLKFNEKXAASRH 
IRVTFL/KFQRRNILQASSQRKQVTYKGAKVRLTSDFSPAILNA 

rrqw/n/pisrvlrennfepriiysaklsflykgnwktfldiqg 
lgkyinqelslkillkdllqltenln 


5855 


536 


2391 


lrsygckapsrishlhk\flfh*lpsllmgysespppitdswap 
fislthhvlsqsqsplssncwiclsthtq*ftalpadlltwtqs 

NVS LH IS YLAI PFLAD9 FIiKP V/ L * PGKSAKHLSFKLSSLSMVS 

gravallhli a5glts iqtntasskppi wgy\lstqtsfi sppp 
lciisrtypnpahatmvgqvpqslcgli ftl/rtpcrps ilhpny 
kiistsawqkvlcfsgsptihtslhlttgssflsfhpipgfpaa 
nsalyvsslkgppgknvtipspvtgt*qpphrgsn/rltvdkdn
fflspkpnslhqlpsq\TpVo>Ltcaalagsypiwenentlswl 
ptfrynfclstpslfflcdtn*ylclpanwsqtctlvfqaptin 

ILPPNQT I L I S VEAS I S SS P I RNKWALHLI TLLTGLG 1 TAALGT 
G I AG I TTS I TS YQTLFTTLSNTVEDMHTS I TS LQRQLDF LVGVI 
LQNWRVLDLLTTEKGGTCIYLQEECCFCVNESGIVHIAVRRLHD 
RAAEL*HQVADSWWQGSSLLRWIPWAPFLGPLIFLFLLLMIGP 
CI FNLVS RF I SQRLNCF I QASMQKHI DN I FHLCHV* YQS LRGNH 
SEAPEPRP 


5856 


173 


1137


PWLHGLGLSAVFLFYL+/YVTFHLYGGIILLLLIFISIAGILYK 
FQDVLLYFPBQPSSSRLYVPMPTGIPHENIPIRTKDGIRLNLIL 
IRYTGDNS PYS PTI IYFHGNAGNJGHRLPNALLMLVNLKVNLLL 
VDYRGYGKSEQEASEBGLYIjDSEAVLDYVMTSPDLDICTKIYLSG 
RSLG\GAAAIHIiASDNSHRISAIMVENTFLSIPHMASTLFSFFP 
MRYLPLWCYKNKFLS YRKISQCRMPS LFISGLSDQL I PPVMKKQ 
LYELS PS RTKRLAI FPDGTHNDTWQCQGYFTALEQFIKEWKSH 
SPEEMAKTSSNVTII 


5857 


1597 


5*3 


kligkvlvi^wadamaafavbpqgpaixssbpmmlgsptspkpg 

VNAQFLPGFLMGDLPAPVTPQPR8 I SGPSVGVMEMRS PLLAGGS 
PPQPWPAHKDKSOAPPVRSIYDDISSPGLGSTPLTSRRQPNIS 
VMQSPLVGVTSTPGTGQSMFSPAS I GQPRKTTLS PAQLDPFYTQ 
GDSLTS BDH\ LDDS WGDC I WGFLKAS A\S Y I LL \QFAQYGGIS * 
NMWMSNTG^MHIRYQSKLQARKAI»SKDGRIPGES1MIGVKPCI 
DKS VMES SDRCALSS PSLAFTP P I KTLGTPTOPGSTPR T Q tmd d 

iataykastsdyqvisdrqtpkkdeslvskameymfgw 


5858 


335 


1419 


PPHQPAAAS TSXHQQQQP P PPPOJDS S KP WAQG PGPAPGVGSAP 

passsappatpptsgappgsgpgptptpppavtsappgappptp 

PSSGVPTTP PQ AGGPP PP P AAVPGPGPG PKQG PG PGGPKGGKMP 

ggpkpgggpglstpgghpkpphrgggeprggrqhhppyhqqhhq 

GPPPGGPGGRSEEKISGPRRGFKANLSUIjRRPGEKTYTQRCRFC 
LI/3IYLLISRRMNSRRLFAKIWENQEKFLSTKAXDSEFIKLESR 
ALA*NCPKPELG * YTP*GGRQLPSSLFPTHACLPI»SCS VI FSPF 
KFPQ*NCWGRKPFRPNLGPHLKGAVCNRWDDPWEGPTGKGHCLN 

FAS


5859 


307 


1503 


GGSSARPRASSRRMLSRKKTKNEVSKPAEVO^KYVKKfiTrSPLLR 
NLMPS FIRHGPTI PRRTDI CLPDSSPNAFSTSGDGWSRNQSFL 
RTPIQRTPHEIMRR3SNRLSAPSYLARSLADVPREYGSSQSFVT 
EVS FAVENGDSGSR YYYSDNFFDGQRKRPLGDRAHEDYRYYEYN 
HDLFQRMPQNQGRHASGIGRVAATSLGNLTNHGSEDLPLPPGWS 
VDWTMRGRKYYIDHNTNTTHWSHPLEREGLPPGWERVESSEFGT 
YYVDHTNKKAQY\RHPC1APTCTS V*ST , TSCI1I/AS /RQQTERNQ 
SLLVPANPYHTAEI PDWLQVYARAPVKYDHILKWELFQLADLDT 


5860


2956 


1270


TIRVEEFPLCPGGGKAQI^SASLLGAGI^PPTPPPL^LLtFP " 
LLLFSRLCGALAGPI IVEPHVTAVWGKNVSLKCLI EVNETITQI 
SWE KI HGKSSQT VAVHHPQYG FS VQGE YQGRVLFKNYS LNDA^I 
TLHNIGFSDSGKYICKAVTFPLGKAQSSTTVTVLVEPTVSLIKG 
PDS LI DGGNETVAAXCI AATG KP VAHIDWEGDLGEMES TTTS FP 
NETATIISQYKLFPTRFARGRRITCVVKHPALEKDIRYSFILDI 
QYAPEVSVTGYDGNWFVGRKGVNLKCNADANPPPFKSVWSRLDG 
QWPDGLLASDNTLHFVHPLTFNYSGVYI CKVT\NSPGS KEVTQK 
VHPTFQDP SLPTYP PLPALQ FQ WAS PSTA* TSRD \ LATEP+ KIA 
PSPLSTL\ATI KGWTQLPTI IA* CSGVGALFI V\LVKCFGLG I F 
CYRRRRTFRGDYFAKNYIPPSDMQKESQIDVLQQDELDPYPDSV 
KKENKNPVNNLIRKDYLEEPEKTQWNNVEOTiNRFBRPMDYYEDL 
KMGMKFVSDBHYDENEDDLVSHVDGSVISRREWYV 


5861 


2051 


1305 


EVCACVQAFWLVASSGDDSQGGDKCGCEVGSWVGSMRWMAfeLL"" 
SEGEQGI PTACAAFAQQPAG/ E PRRGLAGVGEGGPQCS WVNYRC 
TLEFLVSLLGTDLARGRGNSASGPTAPADSKQL/ML * DVHRRVI 
LE*RMNSGSPARDNAPSQRFCTNLSEGLRFGISPSWREAJUYGCH
A


5862 


1556 


483 


PP FQL I MGE I KVS PDYNWFRGT VP LKKI I VDD DDS KIWS L YD AG 
PRS I RCPL I FLP P VSGTADVFFRQ I LALTGWG YRVT ALQ YP VYW 
DHLEFCDGFRKLIiDHLQLDKVHLFGASLGGFLAQKFAEYTHKSP 
R VHSLI LCNS FSDTS I FNQTWTANS FWLMPAFMLKKIVLGNfFSS 
GPVDPMMADAI D FMVDRLESLGQS ELASRLTLNGQNSYVEPHfC I 
RDI PVTIMDVFDQSALSTEAKEEMYKLYPNARRAHLKTGGNFPY 
LCRSAEVNLWQIHL/R/RNSMEPNTRPLTHQWSVPRSLRCRKA 
ALASARRSSSVSLAVNDEtiTRCVLV*SVASAPVSRPFPSGSSGS 
PVLTVSGK 


5863 


2714 


249 


PFPSRGSLPLAAPREDTMGPLMVLFCLIiFI*YPGliADSAPSCPQN 
VNISGGTFTLSHGWAPGSLLTYSCPQGLYPSPASRliCKSSGQWQ 
TPGATRS LSKAVCKP VRC PAP VS FENG I YTPRLGS YPVGGNVSF 
ECEDGF I \LRG S PVRQCR PNGMWDGETAVCDNGAGHCPN PG I SI* 
GP\VRTGFRFGHGDKVRYRCSSNLVLTGSSERECQGNGVWSGTE 
PI CRQP YS YDFPED VAPALGTSFSHMLGATNP.TQKrKESIiGRKI 
Q IQRSGHLNL YLLLDCSQS VS END FL I FKE5AS LM VOR I FS F3 1 
NV S VAI I TFASEP KVLMS VLNDNS RDMTE V I S S LENAN YKDH3N 
GTGTNTYAALNS VYLMMNNQMRLLGMETMAW\QE I RHAI I LL\T 
DGK\SHMGGS PKTAVDHIRE ILNINQKRNDYLDI YAIGVGKLDV 
DWRELNSLGSKKDGERHAFIIjQDTKALHQVFEHMLDVSKLTDTI 

cx3vgnmsanasdqertpwhvtikpksqet\c\rgalisdqwvlt 
aahcfrdgndhslwrvnvgdpksqwgkefliekavispgfdvfa 
kknqg il\ ef ygd \ d i all \ klaqkvkm\ sthcqg psclp \ ctm 
\eanlgflretfkgstcr\dhenel/vwnkqsv\pahf\val\n 
gsklehltlrmgvewtsccrglspkkktm\fpnlt\dvre\vvt 
D\ QFL\ cs \GPQEDESP \ CK * e\ sgga\ vflekr FKLSAGGVWC 
SWGL\YNP\CT^SA\DKNSPKKGPSVAKVPPPTR/DFHIN\LFP 
Q* S P WLRQHPGGM S * I FL PL LANGHLS P FACPAR I CRPLKFLPS 
EWATLRTL 


5864


173 


1013 


PLISVPQSLISLPQPLLCFPGGQEPSAPSPCLYSFLWACSFTMG" 

KLPPSIPPSSPIjACVLKNLKPLQLTPDLKPKCLIFFCNTAWPQY 

KLDNDS K* PENGTFE FSILQVLDNSCHKMGKWS EVPDVQAF F \ S 

HWSLPSLCSQC/GLIPNLSSFSPFCSFG/PPPQVPSP/TESFFS 

MDSS DLPPSPQAAPRQAEPGPMSHLASAPPPYNPFI TSPPHTWS 

SIiQFHSVTSPPPPAQQFTLKKVAGAKGIViCVSAPFSLSQIR*RL 

GSFSSNIKIQPSSWLIWQQP 


5865


568 


1684 


CLPGPRWGEGWRAGHTI VGCI FFKTAI ISHFKGGMYLCVCMCTC""" 
LSVCVCVQVGSWICV/CVSMCACVSLC?C\ICRCISMYTRBHAC 
ACTRV*VYMC^S/VCTCVSTCI0VRVCAHVCVYMCLCIjGYA*AC 
TCV*MCVCMHEHVCMC/VCACSCVLL/CRGHICM/MCMSAYICI 
/CVY VCVLCVWACMRMSTCVWLVYG *ACTCVWMHM/CSCTCR/C 
VHVCCMS MHACE CLCVYLHI CGCAGTRRWWAGSARGS RS CSRL P 
CWAPGPGLSLPGPSCPSVEQGLGGGPGQLQGRSGEARLGEHRGW 
GSPAAVCSRNCTVS PRRGADCFEAPDVPKQPPGWGRASFEERGC 
GGRGWVCAPPLKGPQCCCFSIKPELKAKKKK 


5866


98 


3197 


ARPEVPAP PAWLS RRGAAKMGDKKDDKDS P KKNKGKERRDLDDL 
KKEVAMTEHKMS VBE VCRKYNTDCVQGLTHSKAQE I LARDG PNA 
LTPPPTTP BW VKFCRQLFGGFS ILLWIGAI LCFLAYGIQAGTED 
DPSGDNLYLG I VLAAWI ITGCFS YYQEAKSSKIMESFKNMVPQ 
QALVIREGEKMQVNAEEVWGDLVEIRGGDRVPADLRI ISAHGC 
KVDNSSLTGESEPQTRSPDCTHB\NPLKTRNITFFSNNFVEGTA 
RGWVATGDRTVMGR I ATLASGLE VGKTP IAX EX EHF I QLI TGV 
AVFLGVSFFI LSLILGYTWLEAVI FLIGI I VANVPEGLLATVTV 
CliTLTAKRMAHKN CLVKNLEAVE TLGST ST I CSDKTGTLTQNRM 

tvah^fdnqiheadttei)qsgtsfdksshtwvalf*h/lu;fc 

NRP VFKGGQDN I P VL KRD VAGDAS E SALLKCI ELS SGSVKLMRE 
RNKKVAEIPFNSTNKYQLSIHETEDPNDNRYLLVMKGAPERILD 
RCSTILIiQGKEQPLDEEMKEAFQNAYLBLGGLGERVLGFCHYYL 
PEEQFPKGFAFDCDDVNFTTDNLCFVGLMSMIGPPRAAVPDAVG 
KCRSAGI KVIMVTGDHP ITAKAIAKGVGI I FEGNETVEDIAARL 






WO 01/53312 



PCT/US00/34263 



NIPVSQVNPRDAJ<AC\/^H6TDLKDFTSEQIDBltQNfaTEIVFAR 
?S PQQKL 1 1 VEG CQRQGAI VAVTGDGVNDS PALKKAD IGVAMG I 
AGSDVS KQAADMILLDDNPAS I VTGVEEGRLIFDNLKKSIAYTL 
-SNIPE ITPFLLFIMANIPLPLGTITILCIDLGTDMVPAlSIiAY 
EAAESDIMKRQPRNPRTDKLVNERLISMAYGOIGMIQALGGFFS 
YFVILAENGFLPGNLVGIRLNWDDRTVNDIiEDSYGQQWTYEQRK 
WEFTCHTAFFVSI VWQWADL 1 1 CKTRRNS VFQQGMKNKI L I F 
GLF3ETALAAFLSYCPGMDVALRMYPLKPSWWFCAFPYSFLIFV 
YDEIRKLILRRNPGGWVEKETYY 


5867 


3 


1485 


LPGRRARGGRGLGWPPAQALDGSRMGKAKVPASKRAPSS PVAKP 
G PVKTLTRKXNKKKKRFWKSKARE VSKKP ASflPfSA WP r> oirn ni? 

DPSQNWKALQEWIiLKQKSQAPEKPLVISQMGSKKKPKIIQQNKK 
ETSPQVKGEEMPAGKDQEASRGSVPSGSKMDRRAPVPRTKASGT 
EHNKKGTKERTMGDIVPERGDIEHKKRKAK\GQPQPHPPR/IEH 
WFDDVDPADIEAAIGPEAAKIARKQLGQSEGSVSLSLVKEQAFG 
GLTRAIJu^CEI^VGPJCGEESMAARVSIVWOYGK(^YDKYVKP 
TE P VTDYRTAVSGIRP ENLKQGEELE WQKEVAEMLKGRI hVGH 
ALHNDLKVLFLDHPKKK I RDTOKYKP FKSO VKSGRP «; T .T? T T qrv 

ILGLQVQQAEHCSIQDAQAAMRLYVMVKKEWESMARDRRPLLTA 
PDHCSDDA* QS CPAAAAAPIiQRQCDQSQGQ I TS PQSGNSGETFS 
ESWQRGVAWCY 


5868


2122 


833 


LTAGASHTQDASQSTSAKYPAAAQNL/CVTWAMREDLADIWYIR
AVTVYDKPASFFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV
TERSAFTERDAGSGLVTRLRERPALLVSSTSWTEDEDFSILLAA
LESRV*T\MTLDGHNLPSLVCVITGKGPLREYYSRLIHQKHFQH
IQVCTPWLEAEDYPLLLGSADLGVCLHTSSSGLDLPMKVVDMFG
CCLPVCAVNFKCLHELVKHEENGLVFEDSEELAAQLQMLFSNFP
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT


5869 


2122 


833 


LTAGASHTQDASQSTSAKYPAAAQNL/CVTNAMREDLADIWYIR
AVTVYDKPASFFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV
TERSAFTERDAGSGLVTRLRERPALLVSSTSWTEDEDFSILLAA
LESRV*T\MTLDGHNLPSLVCVITGKGPLREYYSRLIHQKHFQH
IQVCTPWLEAEDYPLLLGSADLGVCLHTSSSGLDLPMKVVDMFG
CCLPVCAVNFKCLHELVKHEENGLVFEDSEELAAQLQMLFSNFP
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT


5870 


2122 


833 


LTAGASHTQDASQSTSAKYPAAAQNIj/CVTEWlMRfibLADI WYXR 
AVTVYDKPASFFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV
TERSAFTERDAGSGLVTRLRERPALLVSSTSWTEDEDFSILLAA
LESRV*T\MTLDGHNLPSLVCVITGKGPLREYYSRLIHQKHFQH
IQVCTPWLEAEDYPLLLGSADLGVCLHTSSSGLDLPMKVVDMFG
CCLPVCAVNFKCLHELVKHEENGLVFEDSEELAAQLQMLFSNFP
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT


5871 


3 


3465 


FFFCRPIiRLYSKTTGDRSAMAGAAGLTAEVSMkVLERRARtKR*S r ~ 
VLKLL*I,SLRRL*LEPTI*NGLLT*CSRLSVFRFLKV\GSVYEP 
LKSINLPRPDNETLWDKLDHYYRIVKSTLLLYQSPTTGLFPTKT 
CGGDQKAK IQDS LYCAAGAWALALAYRRIDDDKGRTHE LEHSAI 
KCMRG I L YCYMRQADKVQQFKQDPRPTTCLHS VFNVHTGDELLS 
YBE YGHIiQINAVSIiYLLYLVEM I SSGLQI I YNTDEVS FIQNLVF 
CV\ERVYRVP\DFG\VWGKREGKYY*/SGSTELHSSSVGLGKRQ 
L* KQFNGFNLFGNQGCSWSVI FVDLDAHNRNRQTLCSLLPRESR 
SHNTDAAIiLPCISYPAFALDDEVLFSQTLDKVVRKLKGKYGFKR 
FLRDGYRTSIiEDPNRCYYKPAE IKLFDGI ECEFP I FFLYMMIDG 
VFRGNPKQVQEYQDLLTPVLHHTTEGYPVVPKYYYVPADFVEYE 
KWNPGSQKRFPSNCGRDGKLFLWGQALYI 1 AKLLADEL IS PKDI 
DPVQRY VPLKDQRNVS MRFSNQGPLENDLVVHVALI AESQRLQV 
FLNTYGIQTQTPQQVEPIQIWPQQELVKAYLQLGINEKLGLSGR 
PDRP I GCLGTS KI YRI LGKTWCY PI I FDLSDFYMSQDVFIiLID 
DIKNALQFIXQYJWKMHGRPLFLVLIREDNIRGSRFNPILDMIAA 
LKKG I IGGVKVHVDRLQTLISGAWEQLDFLRI SDTBELPEFKS 
PEEI^PPKHSKVKRQSSTPSAPELGQQPDVNISEWKDKPTHEIL 
2KLNDCS CLASQAILLGIIXKREGPNFI TKEGTVSDHIERVXRR 








AGSQKLWSVVRRAASLLSKWDSLAPSITNVL\/Q<3kQVTIiGAFG~ 

KBEEVXSNPi,SPRVIQNIIYYKCNTHDERBAVIQQELVIHIGWI 

I SNNPELFSGTLKIRIGWI IHAME YELQIRGGDKPALDLYQLSP 

SEVKQLLLDILQPOQNGRCWLNRRQIDGSLNRTPTOFYDRVWQI 

LERTPNGIIVAGKHLPQQPTLSDMTMYEMNPSLLVEDTLGNIDQ 

PQYRQIVVELLMVVSIVLERNPELEFQDKVDLDRLVKEAFNEFQ 

KDQSRIjKEIEKQDDMTSFYNTPPLGKRGTCSYLTKAVMNLLriEG 

EVKPNNDDPCIilS 


5872 


68 


665 


VQGYMYRFVlt kilNSCYSEKTS I CRHRCCPELPATQPWPTPTVFF 
NIAIDSESLGCI \SFKLFADKV/PKRWKKNFVLLNTGEKVLGDK 
GPCFYRIIPG\LCQGGDFTHHNGTGGKSLYSKEFDDENFI/IjKH 
TAPG VLSTANAGPTTNGSQ FF I CTAKTEDG * QHVVFGKVKDGM5 
IVEALERSGSRNG FCTSKKI TAANCGQL 


5873 


2240 


506 


RRPPEGGSGGGRRTRARMPLPWSLALPLLLSV/VAGGFGNAASAR 
HHGLLAS ARQ PGVCH YGTKLACCYGWRRNS KGVCEATCE PG CKF 
GBCVGPNKCRCFPG YTGKTCSQDVNE CGMKPRPCQHRCVNTHGS 
YKCFCLSGHMIiMPDATCVlISRTPAMTHrY^vcr'C'riTirB'on/v^T nt* 
S SGLRLAPNGRDCLDIDE CASGKVX C P YNRRCVNTFGS YYCKCH 
IGFELQYISGRYDCIDINECTMDSHTCSHHANCFNTQGSFKCKC 
KQGYKGNGLRCSAI PENS VXEVLRAPGT I KDR I KKLLAHKNSMK 
KKAKIKNVTPEPTRTPTPKVNLQPFNYEEIVSRGGNSHGG\KKG 
NEEKMKEGLEDEKREEKALKD*HRRERPFRG\DVFFPKVNEAGE 
FGLIL\VQRKALTSKLEKKADLNISVDCSFNHG\ICDW\KQDR\ 
EDDFDW\NPADR\DNAI\GFY\MAVPGLWQGHK\KDIGRLKLLL 
PDLQ PQSNFCLLFD YRIiAGDKVGKLRVFVXNSNNALAWEKTTSE 
DEKKKTGKIQLYQGTDATKSIIFEAERGKGKTGEIAVDGVLLVS 
GLCPDSLLSVDD 


5874 


2 


3387 


ACPRIJ^RRRVRSLRRRRGWLRARWSRGQNKMAARRITQETFD 
AVLQE KAKR YHMDAS GEAVS ETLQFKAQDLLRAVPRSRAEM YDD 
VHSDGRYSLSGSVAHSRDAGRBSLRSDVFSGPSFRSSNPSISDD 
SYFRKECGRDLEFSHSNSRDQVIGHRKLGHFRSQDWKFALRGSW 
EQDFGH P VS QES S WS QEYS FGPS AVLGDFGSSRL I E KECLEKE \ 
SRDYDVDHSG\EA\DS VLRGS \SQVQA\RGRALN I VDQEG S LLG 
. KGETQGLLTAKGG VGKLVTLRNVSTKKI PTVNR I TPKTQGTNQ I 
QKNTPSPDVTLGTNPGTEDIQFPIQKIPLGLDLKNLRLPRRKMS 
FDIIDKSDVFSRFGIEIIKWAGFHT1KDDIKFSQLFQTLFELET 
ETCAKNLASFKCSLKPEHRDFCFTTIKFLKHSALJCTPRVEINEFL 
NMLLDKGAVKTKNCFFEIIKPFDKYIMRLQDRIiKSVTPLLMAC 
NAYELSVKMKTLSNPLDLALAL2TTNSLCRKSLALLGQTFSLAS 
S FRQE KI L * AVGLQD I APS PAAFPNFEDSTLFGRBY IDHLKAWL 
VSSGCPLQVKKAEPEPMREEEKMIPPTKPEIQAKAPSSLSDAVP 
QRADHRWGTIDQLVKRVIEGSLSPKERTLLKEDPAYWFLSDEN 
SLEYKYYKLKLAEMQRMSENliRGADQKPTSADCAVRAT^LYSRAV 
RNLKKKLLP\WQRRGLLRAQG\IjRG\WKARRA\TTGTQTLLFLR 
APGLKHHGRQAPGLS\QAKPSLPDRND\AAKD\CPLDPV\GPSP 
QDPSLEASGPS PKPAGVDISEAPQTSS pcpsadidmkdngrtae 
KLARFVAQVG\PEIEQF\SI\ENSTDNPDLWFL\HDQNSS\AFK 
FY\RIOCVFELCPSICFTSSPHNl4\HTGGGDTT\GSQESPVDL«E 
GEAEFEDE PPPREAELES PEVMPEEEDEDDEDGGEEAPA\ PGRG 
GPS LEGS T P ADGL PG EA\ AEDDL/ AliGAPALFTGLLQVTCFP FG 
RGFSSKSLKVGMIPAPKRVCLIQEPKVHEPVRIAYDRPRGRPMS 
KKKKPKDLDFAQQKL\TDK\NLGFQ\MLQKMGWKEGHGLGSLGK 
GIR\SRSACTQQAAWGGSGWGLSPSTCSLPLGSFTAKMAYSWQL 
IFVF 


5875 


296 


1848


laaLgglplwrlsrrgfreyllglsapsalggamrsvsyvqrva 
lefsgslfphaiclgdvdndtlnelwgdtsgkvsvyknddsrp 
wltcscqgmltcvgvgdvcnkgknllvavsaegwfhlfdltpak 
vldas ghhetl i geeqrp vfkqh i pantkvml i sdidgdgcrel 
wgytdrvvrafrweelgegpbhltgqlvslkkwmlegqvdsls 
vtlgplglp e lmvsq pgcayai llctwkkdtgs ppas egptdgs 

/SGDPSCPRRGAAPDIWPYPQQECLHSPNWQHQT\SHGTESSGS 



GLFALCTLDGTbKLMEEMEEADKLLWSVQVDHQLFALEKLDVTG 
NGHE BWACAWDGQT Y I IDHNRTWRFQVDENIRAFCAGLYACK 
EGRNSPCLVYVTPNQKIYVYWEVQLERMESTNLVKLLETXP\ST 
TACCRSMAW1LTTSL*LVPCPTKRSTIQTSHHSVLPQASRIPPS 
WTCLIAGEGFF*TPTLPPKGVFGSHCAAAGSITKQ 



"S87~ 



«LPLGVPSKVauaaAMEPQEERETQVAAWLKKIFGDHPI?QYEV 
NPRTTEILHHLSERNRVRDRDVYLVIEDLKQKASEYESEAKYLQ 
DLLMES VNFSPANLS S TGSRYUJALVDSAVALETKDTS LAS F I P 
AVNDLTSDLFRTKSKSEEIKIELEKLBKNLTATLVLEKCLQEDV 
KKAELHLSTER\AKVDNRRQNM\DFLKAXSEEFRFGIQAAGEQIi 
SARGQ\DAFSVPIQSLVALIRENWPRLKQQTIPLK\KKLESYLD 
LMP\NPSHCSK*RIEEAK\RELA\SIEAELTRRV9\MMEIj 



5878 



950 



"2Tir 



GTLGKMAASSSGSKEKERLGGGLGVAG GNSTRERLLSALEDLBV 

lsreliewlaisrnqkllqageenqvlellihrdgefqelmkla 
lnqgkihhemqvlekevekrdsdiqqlqkqlkeaeqilatavyq 
akeklksiekarkgaisseeiikyahrisasnavcapltwvpgd 
prrpyptdlemrsgllgqmnnpstngvnghlpgdaia/rrkiar 
cpcsrvs/ngsqmtcr»iwiililqksvcei, 



5879 



981 



5880 



1138 



1324 



glwkcmqlqgphthrvqp*ptprqqgpq\vpvaviagnrpnyly 
rmlrsllsaqgvspqmitvfidgyyeepmdwalfglrgiohtp 

1SIKNARVSQHYKASI.TATFNLFPEAKFAWLEEDLDIAVDFFS 
FLSQS I HLLEEDDS LYC I SAWNDQG YEHTAEDPALLYRVETMPG 
IX3WVLRRSLYKEELEPKWPTPEKLWDWEWWP4RMPEQRRGRECII 
PDVSRSYHFGIVGLNMNGYFHEAYFKKHKFNTVPGVQLRNVDSL 
KKEAYEVEVHRLLSEAEVLDHSKNPCEDSFLPDTEGHTYVAFIR 
MEKDDDFTTWTQIJUCCJLHIWDLDVRG^GLWRLFRKKNHFLVV 
GVPASPYS VKKPPSVTPI FLEPPPKEEGAPGAPEQT 



RLTEAAAAGS(iSKAAGWAGSPPTLLPX,SPTS PRCAATMASSDED ' 
GTNGGAS EAGE DREAPGKRRRIjGFLATAWLT F YD IAMTAG WL VL 
AI AMVRF YMEKGTHRGIjYKS IQKTLKFFQTFALLE I VHCIil G I V 
PTSVIVTGVQVSSRIFMVWLITHSIKPIQNEBSWLFLVAWTVT 
EITRYSFYTFSLLDHLPYFIKWARYNFFIILYPVGVAGELLTIY 
AALPH VKKTGMFS IRLPNKYNVS FDYY YFLLITMAS YI PLFPQL 

YFHMLRQRRKVLHG\G+L*KRMIK*SLQTRCFFQNNQDYLSPSF 
NNKNKQLCEISWIVWFLKI 



5881 



"26" 



441 



SliWCLVAGGLGLGPSSQNPLQRAGILA RPREARGTFSALTACSA 
SVTSKGKSSSGMWPSAASDRDSPVPLRPPGPVQLPSGTGWVLSD 
* KKKRGRCSS / WLSQPQHEREKEWLLRRSMAEGERARAASDVL 
CRSLANETHQLRRTLTATAHMCQHIAKCLDERQHAQRNVGBRSP 
DQSEHTDGHTSVQSVI EKLQEENRLLKQKVTHVEDLNAKWQRYN 
ASRDEYVRGLHAQLRGLQIPHEPELMRKEISRLNRQLEEKINDC 
A3VKQE LAASRTARDAALE R VQMLE Q Q I LAYKDD FMS ERADRER 
AQSR I QELEEKVASLLHQVS WRQDSREPDAGRIHAGS KTAKYLA 
ADALELMVPGGWRPGTGS QQP E PPAEGGHPGAAQRGQGDLQCPH 
CLQCFSDEQGEEIjLRHVAECCQ 



GGIHPSPTEAPKAQHLTMDCT WRtLFLVAAATGTHAQVQLLQSG 

SEVKKPGASVMVSCYVSGYTLTKLSMHWVRQAPGKGLE*MGPFD 

LQDVETIYPQKFQGRVSMTEETSTETTQ/AYLEIiSSLR3EDTAV 
HHCATDTV 



SGCVEMLYSH5LEYNPEWISVQSAVAPAQLALNSDGDL*LHSGE" 

RTRRD*QLPSAGGPGLQEPIiQLGELDITSDEFir,DEVDG\VDr,R 

HYSKQVELELQQIEQKSIRDYIQESENIASLHNQITACDAVLER 

MEQMLGAFQSDLSS ISSEIRTliQEQSGAMNIRLRNRQAVRGKIiG 

ELVDGLW PS ALVTA1 LEAPVTEPRFLEQLQELDAKAAAVREQE 

ARGTAACADVRGVLDRLRVKAVTKIRBFILQKIYSFRKPMTNYQ 

I PQTALLKYRFFYQFLLGNERATAKE1RDE YVETLSKI YliSYYR 

SYLGRLMKVQYEEVAEKDDLMGVEDTAKKGFFSKPSLRSRNTIF 

TLGTRGSVISPTBIjEAPILVPHTAQRGEQRYPFEALFRSQHYAL 

LDNSCREYLFICEFFWSGPAAHDLFHAVMGRTLSMTLKHLDSY 

iju)cydaiavfi/;ihivlrfrnxaakrdvpaldryweqviallw 








PRFEL I LEMNVQS VRSTD PQRI/SfiliDTRPHY I TRRYAEFS S AL V 
S INQT I PN ERTMQLLGQLQVE VEN FVLRVAAE FSSRKEQLVFL I 
NN YDMMLGVLM \ E * ERAADDS KBVES FQQLLNARTQEF I EELL S 
PPFGGLVAFVKEAEALIE RGQAERLRQE EARVTQIilRGFGSS WK 
S S VES LSQDVMRS FTNPRNGTS 1 1 QGALTQLI Q\ LYHRFHR V\ L 
SQPQLRALPARAELINIHHLMVELKKHKPMF 


5883 


2 


1374 


EFPGRRFRAVMEAGAGAGAGtAAGWSCPGPGPTVI^LGSVEASEG 
CERKKGQRWGSLERRGMQAMEGBVLLPALYEEEEEEEEEEEEVE 
EEEEQVQKGGSVGSLSVNKHRGLSLTETELEELRAQVLQLVAEIi 
EETRELAGQHEDDSLELQGLL3DERLASAQQAEVFTKQIQQLQG 
ELRSLREE ISLLEHEKESBLKEI EQELHLAQAE I QSLRQAAEDS 
ATEHESDIASLQEDLCRMQNELEDMERIRGDYEMEIASLRAEME 
MKSS E PS GS LGLSD YSGLQEELQELRERYHFliNEE YRALQESNS 
SLTGQLADIjESERTQRATERWLQSQTLSMTSAESQTSEMDFLEP 
DPBMQLLRQQLRDAEEQMHGMKNKCQELCCELEELQHHRQVSEE 

EQRRLQRELKCAQNEVLRFQTSHS\SPSHPLPPIPPSSPCLL*A 
LWI S ALLWCWWAETSS 


5884 


4261 


2522 


GVLARAS ARLRVPLTG VRACAE PE VGAE PAKVAGAAEPDE^GGR 
SRLRD CGD YT PS ERLG PKOAMLW FQGAI PAAIATAKRSGAVFVV 
FVAGDDEQSTQMAASWEDDKVTEASSNSFVAIKIDTKSEACLQF 

sqiypwcvpssffigdsgipleviagsvsadelvtrihkvrqm 

HLLXSETS VANGSQS ESS VST PS ASFE PNNTCENSQSRNABLCE 
IPSTSDTKSDTATGGESAGHATSSQEPSGCSDQRPAEDLKIRVE 
RLTKFCT.EERREEKRKEEEQREIKKEIERRKTGKEMLDYKRKQEE 
ELTIGIMLEERNREKAEDRAARERIKQQIALDRAERAARFAKTKE 
EVEAAKAAALLAKQAEMEVKRESYARERSTVARIQFRLPDGSSF 
TNQFPSDAPLEEARQFAAQTVGNTYGNFSLATMFPRREFTKEDY 
KKKLLDLELAPSAS WLLP/ ALF INF * AGRPTAS I VHSSSGDI W 
TLLGTVIiYPFLAIWRLISNFLFSNPPPTQTSVRVTSSEPPNPAS 

SSKSEKREPVRKRVLEKRGDDFKKEGKIYRLRTQDDGEDENNTW 
NGNSTQQM 


5885 


900 


467 


aagggrrsrlsrswptgpskspsgvrccg\rr\awedxdefldv 

IYWFRQI IAVVLGVIWGVLPLRGFLGIAGFCLINAGVLYL YFSN 

ylqideeeyggtweltkegpmtsfa/ivhghldhllhchpl*lm 

VYSSQVLPIQS KGPS 


3O0Q 

588? " 


85 | 
1937 4 


1341 


pfrgraltlkkqprpgvappslgtchksdpgrpaaqsqppspgs 
gtfgllsfrmvrtktwtlkkhfvgyptnsdfelktselpplkmg 

EVLIjEAIiFIjT VDP Y^VAAXRLK EGDTMMGQQVAKVVES KNVAL 
PKGTI VIAS PGWTTHS I SDGKDLEKLLTEWPDT I PLSLALGTVG 
MPGLTAYFGLLE I CGVKGGET VMVNAAAGAVGS WGQI AKLKGC 
KWGAVGSDEKVAYLQKLGFDWFNYKTVESLEETIiKKASPDGY 
DCYFDNVGGEFSNTVIGQMKKFGRIAICGAISTYNRTGPLPPGP 
PPEIGIYQELRMEAFVVYRWQGDARQKALKDLLKSJVLELPYFVI 

D*LQANTLVYKSMKSAKPSLEYISEKLVSG\KIQYKEYIIEGFE 
KMPAAFMGMLKGDNLGKTIVKA 






104 

: 

{ 
\ 


APGCRGgRATRCPCRGPRWDSLGDEAAKS^AAPGGAPGLLGLRE 
RPDRCHPGGDDRGPQLHRGSPG/SPSELSRRPGPPGLPGLQGPP 
PAPGLPQSRTL/PVLCVCDLSPAQCDINCCCDPDCSSVDFSVFS 
ACSVPWTGDSQFCSQKAVIYSIiNFTANPPQRVFELVDQINPSI 
FCIHITN\*NLHYPI*LIQKYL/NENNFDTLMJCrSDGFTLMAESY 
vap * AR " u ^^AAAKiExGVPLO/TSDSFLRFPSSLTSSIiCTDNNP 
AAFLVNQAVKCTRKI NLEQCE EI EALSMAFYSSPE I LRVPDSRK 
KVPIWQSIVlQSUSKTLTRREDTDVLQPTIiVNAGHFSLCVNVV 
LEVKYSLTYTDAGBVTKADLSFVLGTVSSWVPLQQKFEIHFtiQ 
ENTQPVPLSGNPG YWGLPLAAGFQPHKGSGI IQTTNR YGQLT I 
LHSTTEQDCLAJ^GVRTP^FGYTMQSGCKLRLTGALPCQLVAQ 
fCVKSLLWGQGFPDYVAPFGNSQGP/ADMLDWVPIHFITQSFNRK 
DS CQLPGALVI E VKWTKYGS LLNPQAKI VNVTANLI SSSFPEAN 

3GNERTILISTAVTFVDVSAPAEAGFRAPPAINARLPFNFFFPF 
/ 


5888 


375 


2302 


LLCRTPGVAM<iRAl)SEQPSkRPfeCt)DSPRTPSNTPSAEADWSPG" 
LELH PDYKT WG P EQ VCS FLRRGGFEEP VLLKN IRENE I TGALLP 

CLDESRPENLGVSSIXSERKKLLSYIQRLVQIHVimiKVINDPIH 
CHIELHPLLVRIIDTPQFQRLRYIKQLGGGYYVFPGASHNRFEH 
SLGVG YLAGCLVHALGEKQPELQISERDVIiCVQ IAGLCHDLGHG 

PFSHMFDGR FT PTJVRPPVTfMTWP(V^C\7MM1?T7trr tutomti t 

QYGLI PEED I CFI KEQ I VG PLE S PVEDSLWP YKGRPENKS FLYE 
IVSNKRNGIDVDKWDYPARDCHHIjGIQNNFDYKRFIKFARVCEV 
DNELRICARDKEVGNLYDMFHTRNSIiHRRAYQHKVGNl IDTMIT 
DAFLKADDYIE ITGAGGKKYRISTAIDDMEAYTKLTDNI fleil 

ystdpklkdareilkqieyrnlfkyvgetqptcqikikredyes 
lpkevasakpkvlldvxlkaedf I vdvinmdygmqbknp I DHVS 

FYCKTAPNRAIRITKNQVSQLLP\EKFAEQ\ZiIRVYCKKVDRKS 

LYA\ARQYFVQW\CADR\NFT\KPQDGRCY*PPTP*HPQKKGW\ 
NDSTFSPKT PTPTiPPPT.pvcdiA m irvT^ntJM 


5889 


1831 


731 


LPAACGRPVTARPRQAPEGRSGRPRDIOPYPPQVFPPRPDRVAI 

vtggtdgigystakhlarlgmhviiagnndskakqwskikeet 

LNDKET * VLLCCPGWLCLWNSSDPPTSASRGAGTTGVHHHFLIjK 

fgifil\di^sfrrsirqfvqkfkmkki?lhvlinnagvmmvpqr 
ktrdgfeehfglnylghflltnllldtlkesgs pghsarwtvs 

SATHY VAELNMDDLQS S ACYS PHAAYAQSKLALVLFTYHLQRLL 
AAEGSHVTANWDPGVVNTDLYKHVFWATRLAKKLLGWLLFKTP 
DEGAWTS IYAAVTPELEGVGGRYLYNKKETKSLHVTYNQKLOQQ 
LWSKSCEMTCVT.nVTT. 


5890 

1322


200 


FRRGWSAAGRAVPVAFCSRISASSPRRPRGAVRLQSGTEAACRS
GRPDPRPASAAGGHAGERMSQRDTLYHLFAGGCGGTVGAILTCP
LEWKTRLQSSSVTLYISEVQLNTMAGASVNRVVSPGPLHCLKV
ILEKEGPRSLFRGLGPNLVGVAPSRAIYFAAYSNCKEKLNDVFD
PDSTQVHMISAAMAGFTAITATNPIWLIKTRLQL*/SQGTAGKR
RMGAFECVRKVYQTDGLKGFYRGMSASYAGISETVIHFVIYESI
KQKLLEYKTASTMENDEESVKEASDPVGMMLAAATSK\LVATTI
AYPHEWRTRLREEGTKYRSFFQTLSLLVQEEGYGSLYRGLTTH
LVRQIP\NTAIMMATYELWYLLNG


5891 


1322 


200 


FRRGWSAAGRAVPVAFCSRISASSPRRPRGAVRLQSGTEAACRS
GRPDPRPASAAGGHAGERMSQRDTLVHLFAGGCGGTVGAILTCP
LEWKTRLQSSSVTLYISEVQLNTMAQASVNRVVSPGPLHCLKV
ILEKEGPRSLFRGLGPNLVGVAPSRAIYFAAYSNCKEKLNDVFD
PDSTQVHMISAAMAGFTAITATNPIWLIKTRLQL*/SQGTAGKR
RMGAFECVRKVYQTDGLKGFYRGMSASYAGISETVIHFVIYESI
KQKLLEYKTASTMENDEESVKEASDFVGMMLAAATSK\LVATTI
AYPHEWRTRLREEGTKYRSFFQTLSLLVQEEGYGSLYRGLTTH
LVRQIP\NTAIMMATYELWYLLNG


5892 


17^4 


379 


VVLRVCGRLSVNSAVSSRTGGWSAGLTCAMQRLQVVLGriLRGPA 
DSGWMP0AAPCljSGAPHARAaDVV\n7HfiPPTnTr^Br , or , r»i7*irTvii 

TPDELLSAVMTAVLKDVNLRPEQLGDICVGNVLQPGAGAIMARI 
AQFLSDIPETVPLSTVNRQCSSGIiQAVASIAGGIRNGSYDIGMA 
CG VESMS LADRGN PGN I TS RLME KEKARDCL X PMG I TSENVAER 
FGISREKQDTFALASQQKAARAQSKGCFQAEIVPVTTTVHDDKG 
TKRSITVTQDSGIRPSTTMEGLAKLKPAFKKDGSTTAGWSSQVS 
DGAAAI LLARRSKAE EIjG LP I LGVLRSYAWGVPPD I MGIGPA Y 
AI P VALQKAGLTVSDVDI PEINE \AFAS QAAYCVEKLRLPP * EG 
+ T P LGGAS GP *GHPIX3LHWGHVQVI TtAQ * S *S ARGKRAYRSGC 
PCAIGSWNGSPLPVFEYPWGT 


5893


3


1653 


ILSKRRCQKAKTKELMAKKVAVIGAGVSGLISLKCCVDEGIjEPT 
CFERTED IGGVWR FKENVEDGRAS I YQSWTNTSKEMSCFSDFP 
MPEDFPNFLHNSKLLEYFRIFAKKFDLLKYIQFQTTVLSVRKCP 
DFSSSGQWKVVTQSNGKEQS AVFDAVMVCSGHH I LPH I PLKS FP 
GMERFKGQ YFHSRQ YKHPDGFEGKR I LVIGMGNLGSD IAVELSK 
NAAQVFISTRHGTWVMSRISEDGYPWDSVFHTRFRSMLRNVLPR 
TAVKWMIEQQMNRWFNHENYGLBPQNKYIMKEPVLNODVPSRLL 
CGAI KVKSTVKEIiTETS AX FEDGTVE ENI D VI I FATGYS FS FPF 








LEDSLVKVj^r^SLYKYIFPAHLDKSTIACIGLIQPLdgiFPl 5- 
AE LQAR WVTR VFKGLCS LPS ERTMMMD I IKRNEKRIDLFGESQS 
QTLQTNYVDYLDELALEIGAKPDFCSLLFKDPKLAVRLYFGPCN 

SY*YMiVGPGQWEGARNAIFTQKQRILKPLKTRALKDSSNFSVS 
FIiLKILGLLAWVAFF\ CQLQWS 


5894 


174 


1673 


RYS PKKVLQNKESS LKLGMATALVS AHSLAPLNLKKEGLRWRE 
DHYSTWEQGFKLQGNSKGEiGQEPLCKQFRQLRYEETTGPREALS 
RLRELCQQWLQPETHTKEHILELLVLEQFLI I LPKELQARVQEH 
HPES REDWWLBDLQIiDLGETGQQVDPDQ PKKQKILVEEMAP L 
KGVQEQQVRHECEVTKPEKEKGEETRIENGKLIWTDSCGRVES 
SGKISEPMEAHNEGSNLERHQAKPKEK1EYKCSEREQRFIQHLD 
LIEHASTHTGKKLCESDVCQSSSliTGHKKVLS*ERKVlQC\HGV 
LGKAFQRSSHLVRHQKIHLGEKPYQCNECGKVFSQNAGLLEHLR 
IHTGEKPYLCIHCGKNFRRSSHLNRHQRIHSQEEPCBCKECGKT 
FSQALLLTHHQRIHSHSKSHQCNECGKAFSLTSDLIRHHRIHTG 
EKPFKCNICQKAFRLNSHIAQHVRIHNEEKPYQCSECGEAFRQR 
SGLFQHQRYHHKDKLA 


5895 


2967 


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCVVPFLT
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW
EATELQPTLSAALYYL\VVQGKKG\EDVLGSVRRTLTHIDHSLS
RQ\NCPFLAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW
FQTLSTQ\EPCQR\AARRLVLKQ\QGVLALR\PYLQKQPQPSPA
EGKGLSPIEPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQ
NPVLPVAGERNVLITSALPYVNNVPHLGNIIGCVLSADVFARYS
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI
NAVELKKPQCKVCRSCPVVQSSQHLFLDLPKLEKRLEEWLGRTL
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E
GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV
LTPDDQRLLA\HVTLELQHYHQ\LLEKVRIRDALRSILTIS\RH
GNQYI\QVNEPW\KRIKGSEADRQRAGTVTGLAVNIAALLSVML
QPYMPTVSATIQAQLQLPPPACSILLTNFLCTLPAGHQIGTVSP
LFQKLENDQIESLRQRFGGGQAKTSPKPAVVETVTTAKPQQIQA
LMDEVTKQGNIVRELKAQKADKNEVAAEVAKLLDLKKQLAVAEG
KPPEAPKGKKKK


5896 


2967


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCVVPFLT
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW
EATELQPTLSAALYYL\VVQGKKG\EDVLGSVRRTLTHIDHSLS
RQ\NCPFLAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW
FQTLSTQ\EPCQR\AARRLVLKQ\QGVLALR\PYLQKQPQPSPA
EGKGLSPIEPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQ
NPVLPVAGERNVLITSALPYVNNVPHLGNIIGCVLSADVFARYS
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI
NAVELKKPQCKVCRSCPVVQSSQHLFLDLPKLEKRLEEWLGRTL
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E
GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV
LTPDDQRLLA\HVTLELQHYHQ\LLEKVRIRDALRSILTIS\RH
GNQYI\QVNEPW\KRIKGSEADRQRAGTVTGLAVNIAALLSVML
QPYMPTVSATIQAQLQLPPPACSILLTNFLCTLPAGHQIGTVSP
LFQKLENDQIESLRQRFGGGQAKTSPKPAVVETVTTAKPQQIQA








LMDEVTKQGNIVRELKAQKADKNEVAAEVAKLLDLKKQLAVAEG
KPPEAPKGKKKK


5897 


2967 


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCVVPFLT
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW
EATELQPTLSAALYYL\VVQGKKG\EDVLGSVRRTLTHIDHSLS
RQ\NCPFLAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW
FQTLSTQ\EPCQR\AARRLVLKQ\QGVLALR\PYLQKQPQPSPA
EGKGLSPIEPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQ
NPVLPVAGERNVLITSALPYVNNVPHLGNIIGCVLSADVFARYS
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI
NAVELKKPQCKVCRSCPVVQSSQHLFLDLPKLEKRLEEWLGRTL
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E
GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV
LTPDDQRLLA\HVTLELQHYHQ\LLEKVRIRDALRSILTIS\RH
GNQYI\QVNEPW\KRIKGSEADRQRAGTVTGLAVNIAALLSVML
QPYMPTVSATIQAQLQLPPPACSILLTNFLCTLPAGHQIGTVSP
LFQKLENDQIESLRQRFGGGQAKTSPKPAVVETVTTAKPQQIQA
LMDEVTKQGNIVRELKAQKADKNEVAAEVAKLLDLKKQLAVAEG
KPPEAPKGKKKK


5898 


2967


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCVVPFLT
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW
EATELQPTLSAALYYL\VVQGKKG\EDVLGSVRRTLTHIDHSLS
RQ\NCPFLAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW
FQTLSTQ\EPCQR\AARRLVLKQ\QGVLALR\PYLQKQPQPSPA
EGKGLSPIEPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQ
NPVLPVAGERNVLITSALPYVNNVPHLGNIIGCVLSADVFARYS
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI
NAVELKKPQCKVCRSCPVVQSSQHLFLDLPKLEKRLEEWLGRTL
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E
GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV
LTPDDQRLLA\HVTLELQHYHQ\LLEKVRIRDALRSILTIS\RH
GNQYI\QVNEPW\KRIKGSEADRQRAGTVTGLAVNIAALLSVML
QPYMPTVSATIQAQLQLPPPACSILLTNFLCTLPAGHQIGTVSP
LFQKLENDQIESLRQRFGGGQAKTSPKPAVVETVTTAKPQQIQA
LMDEVTKQGNIVRELKAQKADKNEVAAEVAKLLDLKKQLAVAEG
KPPEAPKGKKKK


5899 


326 


1078 


NCPKSKEPNGVRAPSLPSPLRAAMALSDVDVKKQIKHMMAFIEQ 
EANEKAEBIDAKAEEEFNIEKGRLVQTQRLKIMEYYEKKEKQIE 
QQKKILMSTWR^QAPOiKVLRARNDLISDLI^EAXLRLSRIVEDP 
EVYQGLLDKLVLQGLLRLLEPVMIVRCRP\QDLLLVEAAVQKAI 
PEYMTISQKHVEV\QIDKEA*LAVECSWEVWEVYSGNQRIKVSN 
TLESRLDLSAKQKMPEIRMALFGANTNRKFFI 


5900 


£4 


1409 


KAASRDSPCLEFCPLCGVSSHDLQHRMWYHRLSHLHSRLQDLLK 
GGVIYPALPQPNFKSLLPLAVHWHHTASKSLTCAWQQHEDHFEL 
KYANTVMR FDYVWLRDHCRS AS CYNS KTHQRSLDTAS VDLC I KP 
KTIRU5BTTLFFTWPDGHVTKYDLNWLVKNSYBGQKQKVIQPRI 
LWNAE I YQQAQVPS VDCQS FLETNEGLKKFLQN FLLYG IAFVEN 
VPPTQEHTEKLAERISLIRETIYGRMWYFTSDFSRGDTAYTKLA 
LDRHTDTT YFQEP CG I QVFHCLKHEGTGGRTLLVDG FYAAEQVL 








QKAPEEFELbSKSAI \KHEYIEDVGECHQPHDWDWAQS* i6tHG 
/YKE LY LI RYNNYDRAVIwrvPYDWHRWYTAHRTLTI ELRRPE 
NEFWVKLKPGRVLFIDNWRVLHGRECFTGYRQLCGCYLTRDDVL 
NTARLLGLQA 


5901 




2121 


VAIEQTSliKMKQAVGGAPARPTGEYICNQCGAKYTSLDSFQTHL 
KTHLDT VLP KLTC PQ CNKE FPNQESLLKHVTI HFMITSTYYICE 
SCDKQFTSVDDLQKHLLDMHTFVFFRCTLCQBVPDSKVSIQLHL 
\AVKHSNEKKVYRCTSCNWDFRNEOTIiQLHVKHNHLENOGKVHK 
CI FCGBS FGTEVELQCHI TTHS KKYNCKFCS KA FHAI ILLEKHL 
REKHCVFETKTPNCGTJJGASEQVQKEEVELQTLLTNSQESHNSH 

nf2QPWnVPrrCC*TlMVr!f , r»T/^ ,, li &vtmc?t VJ>\Tjr\7 nnuurnn/trin 
u\aa CiCtU vuio Ctl'Pl luLUi LyAAl 1 Ma 1 -b-LnJ N-~iW -uKUHN I RJr G E 3 

AI VKKKAELI KGNYKCNVCS RTFFS ENGIjREHMQTHLGP VKKYM 
CP I CGER FP SLLTLTEHKVTHS KSLDTGN CR I CKMPLQSEEEFIj 
EHCQMKPDLRNSLTGF^CVVCMQTVTSTLELKIHGTFHMQKTGN 
uoAVuilfeKuynVUi>X)XK.(-LAo CIiKe.rl<bKuDI>VKiiDINGIiPYGL 

cagcvnlsksaspginvppgtnrpglgqnektlsa i egkgkvggl 
ktrcs*latfkf*vlkvelpephpkpfhrgvsrpdsnstqlktp 
qvs pmprispsqsdekktyqci kcqmvfynewdiqvhvanhkid 
eglnhecklcsqtfds paklqchl IEHS FEGMGGTFKCP VCFT v 
FVQANKLOQHIFSAHGQEDKIYDCTQCPQKFFFQTEI^NHTMTQ 
HSS 


5902


712 


209 


LKNRRRSRPS IRQS IGSTSVSRWLTSLFTYLDHTADVQ* V*RBF " 
I PLXPRQ* ED*MFQSWLHAWGDTLEEAFEQCAMAMFGYMTDTGT 
VEPLQTVKVETQGDDLQS LLFHFLDEWLYKFSADEFFI P \GWGE 
EFSLSKHPQGTBVKAI TYSAMQVYNEENPEVFVI IDI 


" S903 


2T0g 




DT PGPSLPS TTAP FS LRS LS FPS RPS YLIiPGDPQPLQGRGLPTT 
PALFALSAVPGGAAS PMP PSGIiRLLPLLLPLLWLLVLTPGRPAA 
GLSTCKTI DMELVKRKRIEAIRGQILSKLRLASPPSQGEVPPGP 
LPEAVLALYNS TRDR VAGESAEPE PE PEAD YYAKEVTRVLM VET 
HNE I YDKFKQS THS I YMF FNTS ELREAVPE PVLLSRAEIiRLIiRL 
KLKVEQHVELYQKYSNNSWRYLSNRLLAPSDSPEWLSFDVrGVV 
RQWLSRGGE I EGFRLS AHCSCDSRDNTLQVDINGFTTGR\RGDI* 
ATIHGMiroPFLLI^TPI^RAQHLQS\SRHRQAL\DTNY\CFSF 
HGGRNCLRC/VHC*HH FRKDL\GW\KW1 \HE\ PKOYHANFC\L 
GPCPYIWSLDTOYSKVLAIjYNOXhicpoV ARAAPVprvDnaT tjdV 
LPIVYY\VGRKPKVEQLSNMIVRSCKCS 


5904


3 


112$ 


MMBEIENAINTFKBEQRLI YEELIKEEKTTNNELSAISRKIDTW "" 
ALGNS ETE KAFRAI S S KVP VDKVTP S TLPE E VLD FE KPLQQ TGG 
RQGAWDDYDHQNFVKVRNKHKGKPTFMEEVIjEHLPGKTQDEVQQ 
HEKWYQKFLALEERKKESIQIWKTKKQQKREEIFKLKEKADNTP 
VLFHNKQEDNQKQKEEQRKKQKLAVEAWKKQKSIEMSMKCASQL 
KEEE EKEKKHQKERQRQFKLKLLLES YTQQKKEQEEFLRLE KE I 
RBKAEKAEKR KNAADE I SRFQERDLHKLELKI LDRQAKEOE KSQ 
KQRRliAKLKE KVENNVS RD PS RLY/NTHQRLGRTNQKDRTNRLW 
ATSTYPT*GYSNLETRNTEKSMR 


5905 


287 

- 


2912 


MASFPPR^EKEIVRLRT^GELIAPAAPFDKXCGRENWTVAFAP 
DGSYFAWSQGHRTVKLVPWSQCLQNFLLHGTKNVTNSSSLRLPR 
QNSDGGQKNKPREHIIDCGDIVWSLAFGSSVPEKQSRCVNIEWH 
RFRFGQDQLLLATGLNSGR I KI WDVYTGKXLLNLVDHTG WRDL 
TFAPDGSLI LVSJVSRDKTLRVWDLRDDGN\MMKVliRGHQN WVY\ 
SCAFS PDSSMLCS VGASKAWAAILV* LRLCWHHSHT3ATMVLS 
WAE RVAS LATGLGATFT IG * SNLAFVIiQG VL YVHRCWSMSTFCF 
SFFLFFFFKVISPTVKYH* LLSKLIFQFYGIGSLTSETNLM *S I 
WLSNGFS VLFFGILSDSRDI LRL* FNLKFVLIFF * K* CIVS VQK 
KKKPKR IALLQEERLS *DKPPSSHLI *QTEVNIRILFRAILHS * 
LLIFRI *NCI *TYS * IIDPFYIQMTYDRG*FGKNKMVKF*FIEM 
* LYYFHXIAFSFCNW*HPCCLPKKFHLAVNILFACS ICFSS * A 
QVGD P S LL* TSDYLKGRCQWSNNLLTLR FLSVYF FKNL WSGKK 
REGGL* YLTLFISVYFS *LVFGINGFQYS FWKLHCLYFMFRLI 
FKLTFNRNI *NRI CMSALINLKTDFNLTMTLSIFFKLLI I YNA* 
YNLN * I * QF* YKMCHFVLCMS E *S YNI CLFI AGF\ LWNMDKYTM 








IRKLBGHHHDWACDFSPDGALLATASYDTRVYIWDPHNGDILM 
EFGHLFPPPTPIFAGGANDRWVRSVSFSHDGLHVASLADDKMVR 
FWRIDEDYPVQVAPLSNGLCCAFSTDGSVLAAGTHDGSVYFWAT 
PRQVPSLQHLCRMS IRRVMPTQEVQELP I PSKLLEFLS YRI 


5906 


146 


2038


REGAGSGRMASGA\YNPYIEIIEQPRQRGMRFRYKCEGkSAGSI 
PGEHSTDNNRTYPS I QI MNYYGKGKV\RITLVTK\NDPYKPHPH 

dlvgkdcrd\gyyeaefgqe\rrp\lffqn\lgircvkkkevke 

A\ I ITR\ 1 KAG INPFDVP*KQIiNDIEDCD1iDVVRLWFRVFLPDG 
HGNL\TTALPPV\VSSPI ydnrapntaelrvcrvnkncgsvrgg 
de i flilcdkvqkddi evrfvlndweakgi f5qadvhrqvai vfk 
tppyckaitepvtvkmqlrrpsdqevsesmdfrylpdekdtygn 

KAKKQKTTLIiFQKLCQDHVETGFRHVDQDGLELLTSGDPPTLAS 

qsagitvnfperprpgllgsigegryfkkepnlfshdavvremp 
tgvssqaesyypspgpissglshhasmaplpssswssvahptpr 
sgntnplssfstrtlpsnsqgippflripvgndlnasnaciynn 
addivgmeassmpsadlygisdpnmlsncsvnmkttssdsmget 
dnprliismnlenpscnsvldprdlrqlhqmssssmsagansntt 
vfvsqsdafegsdfscadnsminesgpsnstnpkshvfvqusqy 
sgigsmqneqlsdsfpyeffqv 


5907 


99 


1373 


TYLX,SSWSS**NLDTk^KSQVkV/teiiG«kJKi:sWPVPOPAKQMGK 
KATS KVPSAPHFVHPNDHANREAELKKKWVE EMREKQQAAREQE 
RQKRRTIESYCQDVLRRQEEFEHKEEVLQEIiNMFPQLDDEATRK 
AYYIG5FRKVVEYSDVILEVLDARDPLGCRCFQMEEAVLRAQGNK 
KLVLVLNKIDLVPKE WE KWLD YLRNELPT VAFKASTQHQVKNL 
NRCSVP VDQASESLLKS KACFGAENLMRVLGNYCRLGEVRTHIR 
VGWGLPNVGKSSIiINS LKRSRACS VGAVPG I TKFMQE VYLDKF 
IRLIiDAPGIVPGPNSEVGTILRNCVHVQKLADPVTPVETILQRC 
l^EEI SNYYGVSGFO/TTBHFLTAVAHRIjGKKKKGGL YSQEQAAK 
AVLADWVSGKISFYIPPPATHTLPTHLSAE1VKEMTEVFDIEDT 
EQANEDTMECIATGESDELLGDTDPLEMEIKIJJISPMTKIADAI 
ENKTTVYKIGDLTGYCT^PNRHQMGWAKRNVDHRPKSNSMVDVC 
SVDRRSVLQRIMETDPLQQGO^J^ALKNKKKMQKRADKIASKL 
SDSMMSALDLSGNADDGVGD 


5908 


247 


975 


hcgikkrgegsgspspasggfqlgcqipspslpseeet*KpMtra 
htrtlratltrrpprshstrlrfpmpldgdgglaswk/pmrer* 

GWRRPAKAAGASLGVAATGKRGCRMSKRYLQKATKGKLLI 1 1 fi 
VTLWGKWSSANHHKAHHVKTGTCEWAIiHRCCNKITKIEERSQT 

vkcscfpgqvagttraapscvdasiveqkwwchmqpclegeeck 
vlpdrkgwscssgnkvkttrvth 


5909 


1 


5002 


PAI PGSTI I WAPGSHSAARADGRHGSLPSQSQAPGALCGARAPP 
SSNLRADRSMICAQARAGKNLYHNRFLGLAAMAFPSRNSQSLRR 
CKEPIRYSYNPDQFHNMDLRGGPHDGVTIPRSTSDTDLVTSDSR 
STLMGRSSYYSIGHSQDLVIHWDIKEEVDAGDWIGMYLIDEVLS 
ENFLD YKNRGVNGS HRGQ 1 1 WKI DAS S YFVE PETKI CFKYYHGV 
SGALRATTPSVTVKNSAAPI FKS IGADETVQGOGSRRLI SFSLS 
DFQAMGLKKGMFFNPDP YLKI S IQPGKHS I FPAL PHHGQERRS K 
IIGNTVNPIWQABQFSFVSLPTDVLEIEVKDKFAKSRPIIKRFL 
GKLSMP VQJRLLERHAI GDRWS YTLGRRLPTDHVSGQIiQFRFB I 
TSSIHPDDEEISLSTEPESAQIQDSPMNNLMESGSGEPRSEAPE 
SSESWKPEQLGEGSVPDRPGNQSIELSRPAEEAAVITEAGDQGM 
VSVGPEGAGELLAQVQKDIQPAPSAEELAEQLDLGBEASALLLE 
DGE APAS TKEEPLEE EATTQSRAGREEEEKEQEEEGDVS TLB QG 
EGRLQLRASVKRKSRPCSLPVSELETVIASACGDPBTPRTHYIR 
IHTLLHSMPSAQGGSAAEEEDGAEEESTLKDSSEKDGLSEVDTV 
AADPSALEEDREEPEGATPGTAHPGHSGGHFPSLANGAAQDGDT 
HPSTGSESDSSPRQGGDHSCEGCDASCCSPSCYSSSCYSTSCYS 
SSCYSASCYSPSCYNGNRFASHTRFSSVDSAKISESTVFSSQDD 
EEEENSAFESVPDSMQSPELDPESTNGAGPWQDELAAPSGHVBR 
S PEGtiES PVAGPSNRREGECP ILHNSQP VSQLP SLRPEHHH YPT 
IDE PLPPNWE ARIDSHGRVFYVDHVNRTTTWQRPTAAATPDGMR 
RSGSIQQMEQliNRRYONIQRTIATERSEEDSGSQSCEQAPAGGG 

SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








GGGGSDSBAESSQSSLDLRREGSLSPVNSQKlTLbLQSPAVKFI 
TNPEFFTVLHANYSAyRVFTSSTCI,KHMlLKVRRDARNFKRyQH 
NRDLVNFINMFADTRLELPRGWEI KTDQQGKS FFVDHNS RATTF 
IDPRI PLQNGRLPNHLTHRQHLQRLRS YSAGEASEVSRNRGASL 
LARPGHSLVAAIRSQHQHESLPLAYNDKIVAFLRQPNIFEMLQE 

EIMSYVPLQAAFHPGYSFSPRCSPCSSPQKSPGLQRASARAPSP 
YRRDFEAKLRNFYRKLEAKGFGQGPGKIKLI I RRDHLLEGTFNQ 
VMAYSRKELQRNKLYVTFVGEEGLDYSGPSRBFFFLIiSQEIiFNP 
YYGLFEYSANDTYTVQISPMSAFVENHLEWFRFSGRILGNliALI 
HQYLLDAFFT\RPFYKALL\RLPC\D\LSDLEYLDEEFHQSLQW 
MKDNNITDILDLTFTVNEEVFGQVTERELKSGGANTQVTEKNKK 
EYIERMVKWRVERGWGQTEALVRGFYEWDSRLVSVPDARELE 
IiV I AGTAE I DLNDWRNNTE YRGSYHDGHIiVlRWFWAAVERFNNE 

QRIdlLLQFVTGTSSVPYEGFAAPPWEPMrir.»DT?T X) * trvtarairrrno 

L P PRG \ HTCLQPD WDL PT VS PRTPML YEK\LLTA\ VEETSTFGT 


5910 
5911


1526 


446 


VAEFAAMEPGRTQIKLDPRYTADLLBVI,KTNYGIPSACFSQP?T 
AAQLLRALGPVELALTS ILTLLALGS 3 AI FLEDAVYL YKNTLCP 
I KRRTLLWKSSAPTWSVLCC?PGTiWT pptjr a/t wmtt'podvjii rr* 

FYIiLMLVMVEGFGGKEAVLRTLRDTPMMVHTGPCCCCCPCCPRL 
LLTRKKLQ\R*CWALSNTPS*R*R*PWWACFSSPTASMTQQTFL 
RGACLYGSTLSSA/CSTLIALWTLGI ISRQARLHLGEQNMGAKF 
ALFQ VLL I LTALQPS I FS VLANGGQ 1ACS PPYS S KTR3QVMNCH 
LLILETFLMTVLTRTiYYRRKDIIin/nVRTPQQonT t%t xrr vtvt omu 
AWTMKGCCTH 


109 


595 


QijPLAPClQGKGIiEMRSPKPQS FI IRSSHSGAGLLVKNPSTPVf"" 
CGHRRGGAAFKYKPTPWGPEQRPTGQKHMRGGVSLLSPRLECS 
GTISAHCNbRLPSS SMS PAPAS *I»AGITGVCHHAQL1FVFL VET 
GFHHVGQAGLELL/NWIHLPRPPKVLGLQA 


5912 


924 


277 


MILNKALMliGAIiALTTVMSPCGGEDlVADHTASYGVNLYQSYGP " 
SGQYSHEFDGDEEFYVDLERKETVWQLPLFRRFRRFDPQFALTN 
IAVLKHNLNI VIKRSNSTAATNE VPEVTVFSKSPVTLGQPNTLI 
CL VDN I FP PWNITWLSMGHS VTEGVS ETRP ^SPKQnHjr r t n-nn 

VTSPSFPFE* *DL*TAKVEQLGAWFEPLLKHWGAEIPTTL 


5913 


46 


1198 


qlrmagaegaagrqselepvVslvdVLeedee-leneacavlggs 

DS EKCS YS QGS VKRQAI* YACSTCTPEGBEPAG I CLACS YECHGS 
HKLFELYTKRN FRCDCGNSKFKNLECKLL PDKAKVNSGNKYNDN 
FFGLYCICKRPYPDPEDEIPDEMIQCWCEDWFHGRHLGAIPPE 
SGDFQEKVCQACMKRCS FLWAYAAQLAVTKI S T\GMMDWCGTLM 
B * /DDQEVI KPENGEHGDSTLKEDVPEOGKDDVRRVKVPnMQTJD 
CAGS S SESDLQT VFKNES LNAESKSGCKLQELFCAKQLI KKDTAT 
YWPLN WRS KLCTCQDCMKMYGDLDVL FLTDBYDTVLAYENKGK I 
AQATDRSDPLMDrLSSPTOVQQVELIC/GIQ*FED 


5914 


960 


124 


NLGGS ELP PEEALF 1 Q VASMNQRRVDFYIiAS I EDMIjVAI / GGRN 
ENGALSSVETYSPKTDSWSYVAGLPRFTYGHAGTIYKDFVYISG 
GHDYQIGPYRKNLLCYDHRTDW/EERRPMTTARGWHSMCSIjGDS 
I YS IGGSDDN I ESMERFDVLGVBAYS PQCNQWTRVAPLLHANSE 

sgvavwegriyilggyswentafsktvqvydreadkwsrgvdlp 
kaiaggsacfiap*slgqrtrkrkakargtrtgasdpscaswdh 
phrhl pglcrpaats 


5915 


J.DU4 


703 


fpgrptrplklgrpjikrariiqaphciisprprtcppgalqapea 
pasraegpvavwnghtegpaparsapkbppglprplgsfpcpt 
pqbdfpalggpcpprmppspgfsawllkgtppppppglvppis 
kpppgfsgllpsphp\pvspappppppqk/rprllpap/pglps 

PRELPGEEPSAHPVHQGLPAERRGPLQRVQEPLRGVQTGPDLRS 
PVLQELPGPAGGEFPBGL* +AAGPAAH 


5916
5917


256 

— 13^3 


<J33 

827


sprmweiwgpwhrwesfslegewpsripepspdstkgtsgkgcr 
rvtg avhrhlnhvag i ipwvlhsqlkptaataqdqwtsqqypdh 
pt:rlilq*nqatadknn*ttallqphqrl\vsprmaea 

ftHQILTYLEP/ICLWNYMKIIiTVFLTKSVLEl*KFIHTEiQTYR 

?*NDFFGIKkW k VSKkLRKTSFy 5LAVTFLEQAVVSKECVPVDQ ' 
?M EHLL PS LLSLASD PVPNVRVLLAKALRQMLLB KAYFRNAGNP 
HLE VI EET I LALQSDRDQDVS F FAAIiE PKRRNI I DTAVLEKON 


5918 


13 


1247 


cuMy viUUCKa KKy WKAijKCORGRGGRRAERTGGRGP PGRPR PL P 
PGPARRGRRRMETPFYGDEALSGLGGGASGSGGTFASPGRLFPG 
APP rAAAGSMMKKDALTLSLSEQVAAALK P APAPAS Y p PA\ ADG 
A? S AAP PDGLLAS PDLGLLKLAS PELBRL 1 1 QSNGLVTTTPTS S 
QFLYPKVAAS EEQEFAEGFVKALEDLKKQNQIiGAGRAAAAAAAA 
AGGPSGTATGSAPPGELAPAAAAPEAPVYA\NLSSY\AGGCRGL 
RGGAAT\VAFAAEPVPFPPPPPPGALGPRRP/RLALQGRRPQTV 
PDVP\SFGESP\PLSPIET\DTPRRI\KAKRKRL\RNPQIRAPK 
PASRKLGAQSRALERESEDPS * SPEHGSLASTASLLREQVAQLK 
QKVLSHVNSG CQLLPQHQVPAY 


5919 


1 


4254 


TS^SUUTPTSSUGSlNMBHWIS^AiHCiSTTSlTSSSSTQSG 

GSGAAHRLADVMAQTHI ENHSAPPDVTT YTS EHS I QVERPQGST 

GSRTAPKYGNAELMETGDGVPVSSRVSAKIQQIiVNTLKRPKRPP 

LRBFFVDDFEELLEVC^PDPNQPKPEGAQMLAMRGEQLGVVTNW 

PPS LEAALQRWGTI S P KAPCIiTTMDTNGKPLYI LTYGKIjWTRS M 

KVAYS I IjHKLGTKQEPM VRPGDRVALVF PNNDPAAFMAAFYGCL 

LABWPVP I EVPLTRKDAGSQQIGFLLGSCGVTVALTSDACHKG 

LPKSPTGE 1 PQFKGWPKLLWFVTES KHLSKPPRDWF\ PHIKDAN 

NDTAYI EYKTC K\DGS VLGVTVTRTALLTHCQALTQACGYTEAE 

TIVNVI^FKKDVGLWHGILTSVMNMMHVISIPYSLMKVNPLSWI 

QKVCQYKAKVACVKSRDMHWALVAHRDQRDINLSSLRMLIVADG 

ANPWS ISSCDAFLNVFQS KGLRQEVICJPCASSPEALTVAIRRPT 

DDSNQPPGRGVLSMHGLTYGVIRVPSEEKLSVLTVQDVGLVMPG 

AIMCSVKPDGVPQLCRTDEIGELCVCAVATGTSYYGLSGMTKNT 

FEVFAMTSSGAPISEYPFIRTGLLGFVGPGGLVFWGKMDGI*MV 

VSGRRHNADDI VATALAVEPMKFVYRGRIAVFS VTVIJUJER I VT 

VAEQRPDSTEEDSFQWMSRVLQA1DSIHQVGVYCLALVPANTLP 

KTPLGGIHLSETKQLFLEGSLHPCNVLMCPHTCVTNLPKPRQKQ 

PBIGPASVMVGNLVSGKRIAQASGRDLGQIBDNDQARKFLFLSE 

VLQWRAG/rTPDHILYTLLNCRGAIANSLTCVQLHKRAEKIAVML 

MBRGHLQDGDHVALVYPPG I DLI AAFYGCLYAGCVP ITVRPPHP 

CNI ATTLPTVKMI VEVSRSACIiMTTQL I CKLLRSREAAAAVDVR 

TWPLI LDTDD * P KKRPAQ I CKPCNPDTIjAYLDFS VSTTGMLAGV 

KMSHAATSAFCRS I KLQCELYPSREVAI CLDPYOGLGPVLWCLC 

SVYSGHQSILIPPSELETNPALWLLAVSQYKVRDTFCSYSVMEL 

CTKGLGSQTESLKARGLDLSRVRTCVWAEERPR I ALTQS FS KL> 

FKDLGLHPRAVSTSFGCRVNIiAICLQGTSGPDPTTVYVDMRALR 

HDRVRLVERGSPHSLPLMESGKILPGVRIIIANPETKGPLGDSH 

jjjcj.« vno/uifj Aaisit i ^XvyDBSIjQSDHFNSRLSFGDTQTIWAR 

TGYLGFLRRTELTDANGKRHDALYWGALDEAMELRGMRYHPID 

IETSVIRAHKSVTECAVFTWTNLLWWELDGSEQEALDLVPLV 

TNVVLEEHYLIVGVWVVDIGVIPINSRGEKQRMHIjRDGFLADQ 
LDPIYVAYNM 


5920 


1381 


1499 


UUJAVAHAGVSRI PP* LFPPLHPTFL3LWCLHHKLP / HPPGASM 

vrppwprrppahissvrqastqvprtvphtqrvanigtqttgp 
sgvgcctpgrpllpckcsSaahstyrvqepavhipgqepltasm 

LAAAPLHEQKQMIGERLYPLIHDVHTQLAGKITGMLLEIDNSEL 
LLMLES PESLHAKI DEAVAVLQ AHQAM E Q P KAYMH 


5921 
5922 


727 
2475 


157
495 


VCPGTGGE»GLWGQLGGLPKETPLKPMDAFTGSGLKRKFDDVDV^ 
GSS VSNS DDEIS SSDSADS CDS&NPPTTAS FTPTS I LKRQKQLR 
RKNVRFDQVTVYYFARRQG FTS VPSQGGSSLGMAQRHNS VRS Y J 
LCEFAQEQE VNHRE I LREHLKEEKtJIAKKMKIiTKNGT VESVEAD 
GLTLDDVS DEDI DVENVE VDDYFFLQPLPTKRRRALLRASGVHR 
IDAEEKQELRAIRLSREECX3CDCRLYCDPEACACSQAGIKCQVD 
RMSFPCGCSRDGCGNMAGRIEFNPIRVRTHYLHTIMKLELESKR 
3 \GAAQQPQ \ *GALPDCQLQPDRSTGL * DPS WIGS KGLS FTGKG 
AAATHLI ILRVTENRGAEGKRK 

SYSNWGI,FPSVFIQVPRSRTGNLKPIFLFYSYYE\CMETIiKG\T " 


5923






CLYNATQYKVCSPRNDRPDACYNPSEPAATTVFEIRTGEEXSr- 
S K 1 1 TRTBE KE I PKQI TLRFDACAAINS KKLE I6C6SLN * ERS * 
RVENKYVCHESQVCKNCAYWPCVI * AT*KKNKNDSVYLQKGEAN 
PS CAAGHCNPLELI 1 TNPLDPHWKKGER VTLG INRTGL KPQ Wl 
LIKGEVHKCSPXPVFQTPYEELNLPAPELLKKTKNLPl^LAENV 
I FLLNGTS CYVRGGTTIGDRWPWEA*ELVPTDPAPDI 1PI * KAE 
ASNF* VLKTS I IRQYCXAREGKDFI 1 PVGKPNC IGQKLYNSTTK 
TIT* * DLNHTEXNPFS KFS KLKTA* AHAES H * DWTVPSGLY* IC 
RHRAYFRLPNKWADSCVIGTIKPSFFLLPIKMGELLGFSVYASR 
EKKGIVIGNWKDNEWPRERUQYYGPATWAQDGSWGYR/TP/VY 
MLNWI IRLQAILEI ISNETGRALTVLAWQETQMRNAIYQNRLAL 
DYL LVAEGG VCRKFNLTNCCLQINDQGQVVKNI VRDMTKLAHVP 
IQVWHKFDPESLFGKWFPAIGGFKTLIVGVLLVIRTCLLLPCVL 
P^WIKGIVATLVHQKTSAHVNYMNHYRSISQRDSKSEDESE 


5924 


137 


638 


QLCGRRGQRFRTS ^ KRMHPI * RTCPflTNL/ 1 I'lLSQENTO IRbL 
QQBNREL WI SLEEHQDALELIMS KYRKQMLQLMVAKKAVDAE PV 
LKAHQSHSAEIESQIDRICEMGEVMRKAVQVDDDQFCKIQEKLA 
QLELENKELRELLS ISS ESLQARKENSMDTASQAIK 




274 


2146 


EKGKVKDAGAKQWISLSLSCKGSWETQFSNHI^SLTPPtSVRRM 
PLITTVTLLKMVARHHKKLLCSKAPS TQIiQQKI FLHSQMGIHHQ 
S VCMKLKPNTSHI I S ILMGQPMALVQLETIjAPLTI 1 1 QKFQTQD 
HMKFWKNLPLHSHHLTPS VPQTVI PKKTGSPBI KLK I TKTI QNG 
REI*FESSLCGDLbNRVQASE\Q*NQS IESRKEKRKKSNKKDSSR 
SEERKSHK1PKLEPEEQNRPNERVDTVSEKPREEPVLKEGSPSS 
ANTI FCSNNGS VHW \ FKFQVGDLVWSKVGTYF WWP CM VSSDPQL 
EVHTKINTRGAREYHVQFFSNQPERAWVHEKRVREYKGHKQYEE 
LLAEATKQASNHSEKQKIRKPRPQRERAQWDIGIAHAEKALKMT 
REERlEQYTFIYIDKQPEBAIiSQAKKSVASKTBVKKTRRPRSVL 
NTQPEQTNAGE VASSLSSTE I RRHSQRRHTSAEEEEPPPVKI AW 
KTAAARKSLPAS ITMHfOGSLDLQKCNMS PVVKIEQVFALQNATG 
DGKF I DQ FVYSTKG I GNKTE I S VRGQDRL I ISTPNQRNEKPTQS 
VSS PEATSGSTGS VEKKQQRRS I RTRS ESEKSTE WP KKKI KKE 
QVGFLHVES 


5925 


2l<J 


1911 


MMTAESREAl-GLSPQAAQEKDGIVIVKVEEEDEEDHMWGQDSTL 
QDTPPPDPEIFRQRFRRFCYQNTFGPREALSRLKELCHQWLRPE 
INTKEQlIiELLVLBQFI#SILPKELQVWLQEYRPDSGEEAVTLLE 
DLELDLSGQQVPGQVHGPEMLARGMVPLDP VQESS S FDLHHEAT 
QSHFKKSSRKPRLLgSRALPAAHIPAPPHEGSPRDQAMASALFT 
ADSQAMVKI3DMAVSHLEEWGCQNLARRNLSRDNRQENYGSAF 
PQGGENRNENEESTSKAETSEDSASRGETTGRSQKEFGEKRDQE 
GKTGERQQKNPEEKTRKEKRDSGPAIGKDKKTITGERGPREKGK 
GLGRSFSLSSNFTTPEEVPTGTKSHRCDECGKCFTRSSSLIRHK 
1XHTGEKPYECSECGKAF\SLNS\NLVLHQRI\HTGEKPHECNE 
CGKAFSHSSNLILHQRIHSGEKPYECNECGKAFSQSSD\LTKHQ 
RIHTGEKPYECSECGKAFNRNSYLILHRRVHTREKPYKCTKCGK 
\AFTRSSTLTIjHHR I HARERAS E YS PASLDAFGAFLK5CV 


5926 


2 


233 


URCLMLKQGSUPGSPPAT/CEPPAPPVYQAPCOSCPEPPGAHEP 
SDSPHHTPVHPPPEHSAACPAPATCCPPPRSSMS 


5927 


4146 


1248 


KHFSKFGSQAJjYQLKRPASGQNS isvmpaqkitkpaakygipla 
YKKYGDK1CLHEKKPLQKHKQAHQTPEKRVNTGEERRKI SESAAR 
ivKKiiC r x cke KKUKUQ IIS LMKAEQMKRQBKERLERINRAREQG 
WRNVLSAGGSGEVKAPFLGSGGTIAPSSFSSRGQYEHYHAI fdq 

mqqqraedneakmkreiygrglperqkgqlaverakqveeflqr 

KREAMQNKARAEGHMGILQNIiAAMYGGRPSSSRGGKPRNKEEEV 
YLARLRQI RLQNFNERQQIKAKLRGEKKEANHSEGQEGSEEADM 
RRKK\ IESLKAHANARAAVLKEQLERKRKEAYEREKKWEEHLV 
^GVKSSDVSPPIK3QHETGGSPSKC2QMRSVISVTSALKSVGVDS 
SLTDTRETSEEMQKTNNAISSKREILRRLNENLKAQEDEKGKQN 
L»S DTFE I NVHEDAKEHBKEKS VS SDRKKWEAG GQLVIPLDELTL 
3TSFSTTERHTVGEVIKLGPNGSPRRAWGKSPTDSVLKILGEAE 

LQLQTELItENTTIRSE I S PEGEKYKPLITGEKKVflC'l SHE tNPS \ 
AIVDSPVE7KSPEFSEASPQMSLKLEGNLEEPDDLETEILQEPS 
GTWKDE \ S LPCTITD VWISEEfCETKBTQS ADR I TIQENE VSEDG 
VSSTVDQLSDIHIEPGTNDSQHSKCDVDKSVQPEPPPHKWHSE 
HLNLVPQVQS VQCS PEES PAFRSHSHLFPKNKNKNSLLIGLSTG 
LFDANNPKMLRTCSLPDIjSKLPRTLMDVPTVGDVRQDNLEIDEI 
EDENIKEGPSDSEDIVPEETDTDLQELQASMEQLLREQPGEEYS 
EEEESVLKNSDVEPTANGTDVADEDDNPSSESALNEEWHSDNSD 
GE I ASECECDS VFNHLEELRLELEQEMG PEKPFE VYE Kl XAIHE 
DEDEN I EI CS KI VQNI LGNEHQHLYAKI LHLVMADGAYQEDNDE


5928 


4146 


1248 


KHFSKFGSQaXj YQLKRPASGQNS I S VMPAQKI TKPAAKYG I PLA~ 
YKKYGDKKIiHEKKPLQKHKQAHQTPEKRVNTGEERRKISEEAAR 
KRRLE FIEKE KKQKDQ IIS LMKAEQMKRQEKBRLER INRARBQG 
WRNVLSAGGSGEVKAPFLGSGGTIAPSSF5SRGQYEHYHAIFDQ 
MQQQRAEDNEAKWKREIYGRGLPERQKGQLAVERAKQVEEFLQR 
KREAMQNKARAEGHMGILQNLAAMYGGRPSSSRGGKPRNKEBEV 
YliARLRQIRLQNPNERQQIKAKLRGEKKEANHSEGQEGSEEADM 
RRKK\ IESLKAHANARAAVLKEQLERKRKEAYEREKKVWEEHLV 
AKG VKS SDVS P PLGQHETGGS PS KQQMRSVI S VTS AL KB VG VDS 
SLTDTRETS EEMQKTNNA I SS KRE I LRRLNENLKAQEDEKG KQN 
LSDTPEINVHEDAKEHBKEKSVSSDRKKWEAGGQLVIPLDELTL 
DTSPSTTERHTVGEVIKLGPMGSPRRAWGKSPTDSVLKILGEAE 
LQLQTELLENTTIRSE I S PEGEKYKPLITGEKKVQCISHE INPS 
AIVDSPVETKS PEFSEASPQMSLKLEGNLEEPDDLETEILQEPS 
GTNKDE \ S L PCT I TDVWI SE E KETKETQSADR I TI QENEVS EDG 
V5STVDQLSD I H I E PGTNDSQHSKCDVDKS VQPE P FFHKWHS E 
HLNLVPQVQS VQCS P EE S FAFRSHSHLP P KNKNKNS LLIGLSTG 
L FDANNP KMLRTCS LPDLS KLFRTLMDVPTVGD VRQDNLB I DE I 
EDENIKEGPSDSBDIVFBETDTDLQELQASMEQLLREQPGEEYS 
EEEE S VLKNSDVE PTANGTDVADEDDNPSSES ALNEE WHSDNS D 
GEIASECECDSVFNHLEELRLHLEQEMGPEKFFEVYEKIKAIHE 
DEDENIE I CS K I VQNI LGNEHQHI.YAK I LHLVMADGAYQEDNDE


5929 


3 


1558 


LDFSMTTQLPAYVA I LLF YVS RAS CQDT FTAA VYEHAAI LP NAT 1 
LTPVSREEALALMNRNLDILEGAITSAADQGAHI IVTPEDAIYG 
WNPKRDS LYPYLEDI PDPBVNW IPCNNRNRFGQTPVQ3RLSCL\ 
AKNNS I Y WANI GDKKPCDTSDPQCP PDGRYQYNTD WF\DS QG 
KLVARYHKQNLFMGENQFNVP KEPE I VTFNTTFGS FG I FTCPDI 
LFHDPAVTLVKDFHVDTIVFPTAWMNVLPHLSAVEFHSAWAMGM 
RVNFLASNIHYPSKKMTGSGIYAPNSSRAPHYDMKTEEGKLLLS 
QLDSHPSHS AWNWTS YASS I EALSSGNKE PKGTVFFDEFTPVK 
LTGVAGNYrVCQKDLCCHLS YKMSENI PNBVYALGAFDGLHTVE 
GR Y YLQ I CTLLKCKTTNLNTCGDS AETASTR FEMFS LSGTFGTQ 
YVF PEVLLS ENQLAPGE FQVSTDGRLPSLKPTSGPVLTVTLFGR 
LYE KDWASNAS SGLTAQAR I IML I VI AP I VCS LS W f 


5930 


113 


6082 


rgncfwivpftmaqrtgledperylfvdraviynpatqadwtakH 

klvwipserhgfeaasikeergdevmvelaengkkamvnkddiq 

kmnppkfsjcvedmaeltclneasvlhnlkdryysgliytysglf 

CWINPYKNLPIYSENIIEMYRGKKRHEMPPHIYAISESAYRCM 
LQDREDQS I LCTGESGAGKTENTKXVIQYLAH VASSH KGRKDHN 
IPGE\LERQLLQANPILESFGNARTVQNDNSSRFGKPIRINFDV 
TGYI VGAN I E T YLLEKSRAVRQAKDERTFHI F YQLLS G \ AGEHL 
KSDLLLEGFNNYRFLSNGYIP I PGQ\QDKGNPRGDPGEAMHIMG 
FSHEEILSMLKWSSVLQFGNISFKKERNTDOASMPBNTVAQKL 
CHLLGMNVMEFTRAI LTPRI KVGRDYVQ KAQTKEQADFAVEALA 
KATYERLFRMLVHRINKALDRTKRQGASPIGILDIAGFEIPELN 
S PEQLCINYTNE KLQQLFNHTMF I LEQEE YQREGI EWN P IDFGL 
DLQPCIDL I ERPANP PGVLALLDEE CWFP KATDKTFVEKLVQEQ 
GSHSKFQKPRQLKDKADFCI IHYAGKVDYKADEWLMKNMDPLND
NVATLLHQSSDRFVAELWKDVDRIVGLDQVTGMTETAFGSAYKT 
KKGMFRTVGQLYKESLTKLMATLRNTNPNFVRCI I FNHBKRAGK 
LDPHLVLDQLRCNGVLEGIRICRQGFPNRIVFQEFRQRYEILTP

NAIPKGFMDGKQACBRMIRALElLDPI^YRiGQSKIPFRAGV^jr" 

LBKKRDLK ITD III FFQAVCRGYLARKAFAKKQQQLSALKVLQR 

NCAAYLKLRHWQWWRVFTKVKPI*LQVTRQEEELQAKDEELLKVK 

EKQTKVEGELEEMERKHQQLLEEKNILAEQLQAETELFAEAEEM 

RARLAAKKQE LEE I LHDLESRVEEBEERNQ ILQNEKKKMQAHIQ 

DLEEQLDBEEGARQKLQLEKVTAEAKIKKMEEEILLIiEDQNSKF 

IKEKKLMEDRIAECSSQLAEEEEKAKNLAKIRNKQEVMISDLEE 

RLKKEEKTRQELEKAKRKI*DGETTDLQDQIAELQAQIDELKLQL 

AKKEEELCK5ALARGDDETLHKNNALXVVRELQAQIAELQEDFES 

EKASRNKAEKQKRDLSEELEALKTELEDTLDTTAAQQBLRTKRE 

QEVAELKKALEEETKNHEAQIQDMRORHATAIiEELSEQLEOAKR 

FKANLE KNKQGLrETDNKELACEVKVLQQVKAESEHKRKKLDAQV 

QELHAKVSEGDRLRVEIiAEKASKLQNELDNVSTLLEEAEKKGIK 

FAKDAAS^SQLQDTQELIjQEETRQKLNLSSRIRQLEEEKNSLQ 

EQQEEEEEARKNLEKQVIALQSQLADTKKKVDDDLGTIESLEEA 

KKKLLKDAEALSQRI^EKAXAYDKLEXTKNRLQQELDDLTVDLD 

HQRQ VAS NLEKKQ \ KKFDQLLAEEKS I SARYAEERDRAEAEARE 

KETKALS LARALE EALEAKEEPERQNKQLRADMEDIiMS SKDDVG 

KNVHELEKSKRALEQQV\EEMRTQLEELiEDELQATEDAKLRLEV 

NMQAMKAQFERDLQTRDEQNEEXKRLLIKQVRELEAELEDERKQ 

RALAVAS KKKM E I DLKDLB AQI EAAN KARDE V I KQLRKLQAQMK 

DYQRELEEARA5RDE I FAQS KBSEKKLKSLEAE I LQLQEE LASS 

ERARRHAEQERDELADEITNSASGKSALLDEKRRLEARIAQLEE 

SLEEEQSNMELLJIDRFRKTTLQVDTLNAELAAERSAAQKSDNAR 

QQLERQNKELKAKLQELEGAVKSKFKATISALEAKIGQLEEQLE 

QEAKERAAANKLVRRTEKKLKEIFMQVEDERRKADQYKEQMEKA 

NARMKQLKRQLBEAEEEATRANASRRKLQRELDDATEANEGLSR 

EVSTLKNRLRRGGPISFSSSRSGRRQLHLEGASLBLSDDDTESK 
TSDVNETQPPQSB 


5931 


113 


6082 


rgncfwivpftmaqrtgledperylfvdraviyiIpAtqadwtak 

KLVWIPSERHGFEAAS IKEERGDEVMVELAEftGKKAMVNKDD IQ 

kmnppkfskvedmaeltclneasvlknlkdryysgliytysglf 

CWINPYKNLPIYSENIIEMYRGKKRHEMPPHIYAISESAYRCM 
LQDRRDQS I L CTGESGAGKTENTKKVIQ YLAHVASSHKGRKDHN 
I PG E \LBRQLLQANP I LES FGNARTVQNDNS SRFGKFIRINFD V 

tgyivganietylleksravrqakdertfhifyqllsgVagehl 

ksdlllegfnnyrflsngyipipgq\qdkgnfrgdpgeamhimg 

fsh ee ilsmlkwss vlqfgni s fkkerntdqasmpentvaqkl 

chllgmnvwe ftrai ltprikvgrd wqkaqtkeqadfaveala 

katyerlfrwlvhrinkaldrtkrqgasfigildiagfeifeln 

sfeqlcinytneklqqlfnhtmfileqeeyqregiewnfidfgl 

dlqpcidlierpanppgvlalldeecwfpkatdktfveklvqeq 

gshskfqkprqlio^kadfciihyagicvdykadewlmknmdpi^ 

nvatllhqssdrfvaelwkdvdrivgldqvtcmtetafgsaykt 

kkgmfrtvgqlykesltklmatlrntnpnfvrci ipnhekragk 

ldphlvldqlrcngvlegiricrqgfpnrivfqefrqryeiltp 

naipkgfmdgkqacermiraleldpnlyrigqskiffragvlah 

leeerdlkitdiiiffqavcrgylarkafakkqqqlsalkvlqr 

ncaaylkijuwqwwrvftkvkpij^vtrqeeelqakdeellkvk 

ekqtkvegeleembrkkqqllebknilaeqlqaetelfaeaeem 

rarlaakkqeleeilhdlesrveeeeernqilqnekkkmqahiq 

dleeqldeeegarqklqlekvtaeakikkmeeeillledqnskf 

ikekklmedriaecssqlaeeebkaknlakirnkqevmisdlee 

rlkkeektrqelekakrkldgettdlqdqiaelqaqidelklql 

AKKEEELQGAIJUeGDDBTLHKNNALKVVREI,OAQIAELQEDFES 
EKASRNKAEKQKRDLSEELEALKTELEDTLDTTAAQQELRTKRB 
QEVAELKKALEEETKNHEAQIQDMRQRHATALEELSEQLEQAKR 
FKANLEKNKQGLETDNKELACBVKVLQQVKAESEHKRKKLDAQV 
QELHAKVSEGDRLR VELAE KASKLQNELDNVS TLLEEAEKKG 1 K 
FAKDAASLESQLQDTQELLQEETRQKLNLSSRIRQLBEEKNSLQ 
EQQEEEEEARKNLEKQVIJU.QSQIiADTKICKVDDDLGTIESLEEA 

KKKLLKDAEALSQRLEE KALAYDKLEKTKNRLQQE LDDLTVDLD 
HQRQVASNLE KKQ \ KKFDQLLAEEKS I S AR YAEERDRAEAEARE 
KETKALS liARALEEAliEAKEBFERQNKQLRADMEDLMSS KDDVG 

KNVHRTiRK&K R ATi??fVW\ STCMDTv^r put cnsr j-Mtmmt* m*v v>v . 
iui v ncjjo iu5 jvi^Huisyyv \CiiiPiK iVLiBEJjEDELQATBDAKLRLEV 

NMQAMKAQPERDLQTRDEQNEEKKRIiIjIKQVREIiEAELEDERKQ 
RALAVAS KKKME IDL KDLEAQ I E AANKARDEVTKQLRKLQAQMK 
DYQRELEEARASRDEI FAQSKESEKKLKSLEAEILQLQE3LASS 
ERARRHAEQERDELADE ITNSASGKSALLDEKRRLEARI AQLEE 
ELEEEOSNMELLNDPFP JTTTT.nvnTr .Macr tv a b»d e n TiAvonm t> 

QQLERQNKELKAKLQELEGAVKS KFKATI SALEAKIGQLEEQLE 
QEAKERAAANKLVRRTEKKLKEIFMQVEDERRHADQYKEQMEKA 
KARMKQLKRQLEEAEEEATRANASRRKLQREI#DDATEANEGLSR 
EVSTLKNRJjRRGGPISFSSSRSGRRQLHLEGASLELSDDDTESK 
TSDVNETQPPQSE 


5932 


33 


SM 


RHLEEICFLFLQKGRKLKLSGPRWEEGKPRGTGGLWVKAEANMG 
FG ATLAVGLT I FVLS WTI I ICFTCSCCCLYKTCRRPRPV\APP 
PHPP/PWHAPYPQPPSVPPSYPGPSYQGYHTMPPQPGMPAAPY 
PMQYP P P YPAQPMGPPAYHETLAGGAAAP YPASQPP YNPAYMDA 
PKAAL 


5933 


1 


3190 


GTRKIJCMADKTPGGSQKASSKTR55DV^5SdSSDAriMD^P6D 
SDMPSRTRPKSPRKHNYRNESARESLCDSPHQNLSRPLLENKLK 
AFS IGKMSTAKRTLS KKEQEELKKKEDEKAAABI YEE FLAAFEG 
SDGNKVKTFVRGGWNAAKEEHETDEKRGKIYKPSSRFADQKNP 
PNQSSNERPPSLLVIETKKPPLKKGEKEKKKSNLBLFKEELKQI 
QEERDERHKTKGRLiSRFEPPQSDSDGQRRSMDAPSRRNRSSGVL 
DDYAPGSHDVGDPSTT\NFYLGKI\NPQMNLKKCCCQEFGRFGP 
LASVKIMWPRTDEERARERNCGFVAFMNRRDAERALKNIjNGKMI 
MSFEMKLGWGKAVPIPPHPIYIPPSMMEHTLPPPPSGLPFNAQP 
RERLKWPI^PMLPPPKNKEDFEKTLSQAIVKVVIPTERNLLALI 
HRMIEFVVREGPMFEAMIMNREINNPMFRFLFENQrPAHVYYRW 
KLYS ILQGDS PTKWRTEDFRMFKNGS FWRPPPLNPYIiHGMSEEQ 
ETEAFVEEPS KKGALKEEQPJDKLEE I LRGLTPRKNDIGDAMVFC 
LNNAEAAEEIVDCITESLSILKTPLPKKIARLYLVSDVLYNSSA 
tv.VAWAJ> x xKKrrJSTKtjCQI FSDLNATYRTIQGHIiQSENFKQRVM 
TCFRAWED WAI YPEPFL I KLQNI FLGLVNI IBEKETED VPDDLD 
GAPIEEELDGAPLEDVDG I P IDATP I ODLDGVP I KS LDDDLDG V 
P LDATEDS KKNE P I FKVAPS KWEAVDE S ELEAQAVTTSKWELFD 
QHEESEEEENQNQEEESEDEEDTQSSKSEEHhXYSNPIKEEMTE 
SKFS KYS EMSEEKRAKLREI ELKVMKFQDELB5GKRPKKPGQSF 
QEQVEHYRDKLLQREKEKELERERERDKKDKEKLESRSKDKKEK 
DECTPTRKERKRRHSTS PS PS RSSS GRRVKS PS PKS E RSERS ER 
SHKESS RSRSSHKDS PRDVS KKAKRS PSGSRTPKRSRRSRSRS P 
KKSGKKSRSQSRSPHRSHKKS KGKTNTGRKFFKKAVTYWKCDLF 
LCPERSVF 


5934 


1 


3190 


GTRKilQ^KTPGGSQKASSKTRSSDVHSSGSSnAHMDASGPSD 
SDMPSRTRPKSPRKHNYRNESARESLCDSPHQNLSRPLLENKLK 
AFS I GKMS TAKRTLS KKEQEELKKKE DE KAAAE 1 YEEFLAAFEG 
SDGNKVKTFVRGG WNAAKEEHETDE KRGKI YKPS SRFADQKNP 
PNQSSNERPPSLLVIETKKPPLKKGEKEKKKSNLELFKEELKQI 
QSERDERHKTKGRIiSRFEPPQSDSDGQRRSMDAPSRRNRSSGVL 
DDYAPGSHDVGDPSTT\NFYLGNI\NPQMNLKKCCCQEFGRFGP 
IJ^VKIMWPRTDEERAREraCGFVAFMNRRDAERALKNLNGKMI 
MSFEMKLGWGKAVPIPPHPIYIPPSMMEHTLPPPPSGLPENAQP 
RERLKNPNAP^PPPKNKEDFEKTI^QAIVKOTIPTERNtiALI 
HRMIEFVVREGPMFEAMIMNREINNPMFRFriFBNQTPAHVYYRW 
KLYS ILQGDS PTKWRTEDFRMFKNGS FWRP PPLNP YLHGMS EEQ 
ETEAFVE E PS KKGALKE EQRDKLEE I LRGLT P RXND IGDAMVFC 
LNNAEAAEEIVDCITESLSILKTPLPKKIARLYLVSDVLYNSSA 
KVANASYYRKFFETKLCQIFSDLNATYRTIQGHLQSEKTFKQRVM 
TCFRAWED WAI YPEPFLIKLQNI FLGLVNI IEEKETEDVPDDLD 
GAP I EE E LDG APLED VDG I P I DAT P I DDLDG VP IKS LDDDLDG V 

PLDATEDS KKNE P I FKVAP SKWEAVDBSELEAQAVTTS KWELFD 
QHEESEEEENQNQEEESBDBEDTQSSKSEEHHLYSNPIKEEMTE 
SKFSKYSEMSEEKRAKLREIELKVMKFQDELESGKRPKKPGQSP 
QEQ VEI I YRDKLLQRE KE KELERERERDKKDKE KLES RS KDKKEK 
DE CTPTRKERKRRHSTS P SPS RSSSGRRVKS PS PKS ERS ERSER 
SHKESSRS RS SHKDS PRDVSKKAKRSPSGSRT PKRSRRS RSRS P 
KKSGKKSRSQSRSPHRSHKKSKGKTNPTGRKFFKKAVTYWKCDLP 
LCPERSVF 


5935 


3


4493


SYWLSGWRLSRPPRQFWAGWRGIGRFGTMAPVHGDDCEIGASAL 
SDSGSFVSSRARREKKSKKGRQEAIiERLKKAKAGERYKYEVEDF 
TGVYEEVDEEQYSKLVQARQDDDWI VDDDG IGYVEDGREI FDDD 
LEDDALDADE KG KDGKARNKDKRNVKKXiAVTKPNN I KSMF I ACA 
GKKTADKAVDLS KDGLLGDIIiQDLNTETPQI T P PPVMILXKKRS 
IGASPNPFSVHTATAVPSGKIASPVSRKEPPLTPVPLKRAEFAG 
DD VQVESTEEEQES GAME FED GDFDEPMEVEEVDLEPMAAKAWD 
KESEPAEEVKQEADSGKGTVSYLGSFLPDVSCWDIDQEGDSSFS 
VQEVQVDS SHLPLVKGADE EQVFHFYWLDAYEDQ YNQ PG WFLF 
CKVWI ES AETHVS CCVM VKNI ERTL YFL PREMK I DLNTG KE TGT 
PISMKDVYEEFDEKIATKYKIMKFKSKPVEKNYAFEIPDVPBKS 
EYLEVKYSAEMPQLPQDLKGETFSHVFGTNTSSLELFLMMRKIK 
GPCWliEVKKSTAI^QPVSWCKVEAMALKPDLVNVIKDVSPPPLV 
VMAFSM KTMQNAKNHQNE I IAMAALVHHSFALDKAAPKPPFQSH 
FCWSKPKDCIFPYAFKEVIEKKNVKVEVAATERT1XGFFLAKV 
HKIDPDIIVGMNIYGFELEVLLQRINVCKAPHWSKIGRLKRSNM 
PKLGGRSGFGERNATCGRMICDVEISAKEIilRCKSYHLSELVQQ 
ILKTERWIPMENIQNMYSESSQLLYLLEHTWKDA\KFILQIMC 
ELNVLPLAXiQ I TN I AGNI MSRTLMGGRSERNE FLLLHAFYENN Y 
I VPDKQ I FRKP QQKLGDEDEE IDGDTNKYKKGRKKGAYAGGLVL 
DPKVGFYDKFI LLLDFNS LYPS I IQEFNICFTTVQRVASEAQKV 
TEDGEQ3QIPELPDPSLEMGILPREIRKLVERRKQVKQLMKQQD 
LN P DL I LQ YD I RQ KAL KLTANSM YGCLG FS YS RFYAKPLAALVT 
YKGRE II^IHTKEWVQKMNLEVI YGDTDS IM INTNSTNLEBVFKL 

GNYVTKQELKGLDIVRRDWCDIiAKDTGKFVIGQXLSDQSRDTtV 
ENIQKRLIEIGENVLNGSVPVSQFEINKALTKDPQDYPDKKSLP 
HVHVALW INS QGGRKVKAGDTVSYVI CQDGSNLTAS QRAYAPE Q 
LQKQDNLTIDTQYYIAQQIHPWARICEPIDGIDAVLIATGWEL 
\DPTQFKVHHYHKDEENDALLGGPAQLTDEEKYRDCERFKCPCP 
TCGTENI YDNVFDGSGTDMEPSLYRCS WIDCKAS PLTFTVQLSN 
KLIMDIRRFIKKYYDGWLICEEPTCRNRTRHLPLQFSRTGPLCP 
ACMKATLQPEYSDKSLYTQLCFYRYIFDAECALEKLTTDHEKDK 
LKKQFFTPKVLQDYRKLKNTAEQFLSRSGYSEVNLS KLFAGCAV 
KS 


5936 


1124 


139 


RGEEQFDAEFRRFACLGFGERLQEFSRLLRAVHRSRAWTCYLAI 
RMLMATCCP5PTTTACTGPWQRAPPLRLLVQKREADSSGLAFAS 
NSLQRRKKGLLLRPVAPIiRTRPPLLISLPQDFRQVSSVIDVDLL 
PETHRRVRLHKHGSDRPLGFYIRDGMSVRVAPQG\LERVPGIFI 
SRLVRGGliAESTGLLAVSDE I LE VNGIEVAGKTLNQVTDMMVAN 
SHN\LIVTVKPANQRNNWRGASGRLTGPP3AGPGPAEPDSDDD 
SSDLVIENRQPPSSNGLSQGPPCWDLHPGCRHPGTRSSLPSLDD 
QEQASSGWGSRIRGDGSGFSL 


5937 


31 


1600 


PTS LLKST VQLMCRLLQDKRYQCVYSLAEI FKVLASFYVl LVI L " 
YGLTSSYSLWWMLRSSLKQYS FEALREKSNYSDIPDVKNDFAFI 
LHLADQYDPLYSXRFS I FLSEVSENKLKQINLNNEWTVEKLKSK 
LVKNAQDKIELHLFMLNGLPDNVFBLTBMEVLSLEIilPEVKLPS 
AVSQLVNLKE LRVYHS S LWDHPALAFLEENLKI LRL KF TEMG K 
I PRWVFHL KNL KEL YL SGCVLPEQLSTMQLEG FQDLKNLRTLYL 
KSSLSRIPQWTDLLPSLQKLSLDNEX3SKLVVLNNLKKMVNLJCS 
LELI SCDLERI PHS IFSLNNLHELDLRBKNLKTVEBI XSFQHLQ 
NLSCLKLWHNNIAYIPAQIGALSNLEQLSLDHNNIENLPLQLFL 
CTKLHYLDLSYNHLTFIPEEIQYL\SNLQYFAVTNNNIfiMLPDG 

LFQCKKLQCLLLGKNSLMNLS PHVGELSNLTHREPIG \W YLETL " 
PPBLBGCQSLKRNCLIVEENLLNTLPLPVTERLQTCLDKC 


5938 


395 


1865 


YKGEGFFCNQEARGERbKKKKAMS S PN I WSTGSS VYSTPVFSQK 
MT VW I LLLLS L YPGFTSQKSDDD YED YASNKTWVLTP KVPEGD V 
TVU^NNLLEGYDNKIiRPDIGVKPrLIHTDMYVNSlGPVNAINME 
YTIDIFFAQTWYDRRLKFNSTIKVLRLNSNMVGKI WI PDTFFRN 
SKKADAHWITTPNRMkRIWNDGRVLYSLRLTI DAECQIiQLHNFP 
MDEHS CPIiEFSS YG Y PR EEI VYQWKRSS VEVGDTRS WRLYQFS F 
VGLRNTTBWKTTSGDYWMSVYFDLSRRMGYFTIQTYIPCTLI 
WLSWVSFW1NKDAVPARTSLG1TTVLTMTTLSTIARKSLPKVS 
YVTAMDL F VS VCFI FVFSALVE YG \TLHYFVSNRKPS KDKDKKK 
KNPAPTID IRPRSAT IQMNNATHLQERDEEYGYECLDGKDCASF 
FCCFEDCRTGAWRHGRIHIRIAKMDS YARI FFPTAFCLFNLVYW 
VSYLYL 


5939




1404 


I RPG YLKEVQENS PGH RAGLE P F FDF I VM I NgSRIjNKDMDTLKD 
LLKANVEKPVKMLIYSSKTLELRETSVTPSNLWGGQGLLGVSIR 

FCS fdganenvwhvlbvesnspaalaglrphsdyi igadtvmne 

SBDLFS L I E THEAKPLKL Y VYNTDTDNCREV 1 1 T PNSAWGGEGS 
LGCGIGYGYLIIRIPTRPFEEGKKISLPGQMAGTPITPLKDGFTE 
VQLSSVNPPS LS PPGTTGIEQSLTGLSISSTP \ PAVSSVLSTGV 
PTVP\LLPPQVNQSLT5VPPMESSYLHLPGLMPFTRQGLPNLPQ 
PSTFNLPR\PTHSWPGVGbYQEFVKPGVLPPLSSMPPRNLPG\I 

aplplpseflpsfplvpesssaassgellsslpptsnapsdpat 
ttakadaassltvdvtpptakapttvedrvgdstpvsekpvsaa 

VDANASESP 


5940 


145 


717 


rrsasrsasprqsagtavttgtraggtclaaahhrmrwradgrs 

LEKLPVHMGLVITEVEQEPSFSDIASLWWCMAVGISYISVYDH 
QGIFKRNNSRLMDEILKQQQELLGLDCSKYSPEFANSNDKDDQV 
LNCHLAVKVLS PEDGKADIVRAAQDFCQLVAQKQKRPTDLDVDT 
LA\VYltVQMWLILI 


5941 


13 


6147 


MCLGRMGASSPRSPEPVGPPAPGtpFCCGGSI^VVVtLALPVA 

WGQCNAPEW\LPFARPTNLTDEFEFPIGTYLNYECRPGYSGRPF 

S 1 1 CI*KNS VWTGAKDRCRRKS CRNPPDPVNGKVHVI KGIQ FGS Q 

IKYSCTKGYRLIGSSSATCIISGDTVIWDNETPICDRIPCGLPP 

TI TNGDFI S TNRENFHYGS WT YRCNPGSGGRKVFEL VGEPS I Y 

CTSNDDQVGIWSGPAPQCHPNKCTPPNVENGILVSDNRSIiFSL 

NEWEFRCQPGFVMKGPRRVKCQALNKWEPELPSCSRVCQPPPD 

VlflAERTQRDKDNFSPGQEVFYSCTPGYDLPJSAASMRCTPQGDW 

S PAAPTCEVKSCDDFMGQLLNGRVLFPVNLQLGAKVDFVCDEGF 

QLKGSSASYCVLAGMESLWNSSVPVCEQIFCPSPPVIPNGRHTG 

KPLEVFPFGKAVNYTCDPHPDRGTSFDLIGESTIRCTSDPQGNG 

VtfSS PAPRCGlLGHCQAPDHFIiFAKLKTQTNASDFP IGTSLKYE 

CRPE Y YGRPFS ITCLDNLVWS S PKDVCKR^CSCKTPPDPVNGM VH 

VITDIQVGSRINYSCTTGHRLIGHSSAECILSGNAAHWSTKPPI 

CQRIPCGLPPTIANGDFISTNRENFHYGSWTYRCNPGSGGRKV 

FELVGEPS I YCTSNDDQVGI WSGPAPQCI IPMKCTPPNVENGIL 

VSDNRSLFSLNBWEFRCQPGFVMKGPRRVKCQALNKWEPELPS 

CSRVCQPPPDVLHAERTQRDKDKFSPGQEVFYSCEPGYDURGAA 

SMRCTPQGDWSPAAPTCEVKSCDDFMGQLLNGRVLFPVNIiQLGA 

KVDFVCDEGFQLKGSSASYC^IjAGMESLWNSSVPVCEQIFCPSP 

PVIPNGRHTGKPLEVFPFGKAVNYTCDPHPDRGTSFDHGEST1 

RCTSDPQGNGVWSS PAPRCGI LGHCQAPDH FLFAKLKTQTNAS D 

FPIGTSLKYECRPE Y YGRPFS I T CLDNLVW S S PKDVCKRKS CKT 

PPD P VNGMVHVTTD I QVGSR INYSCTTGHRLIGHSSAE C I LSGN 

TAHWSTKPPICQRIPCGLPPTIANGDFISTNRENFHYGSWTYR 

CNLGSRGRKVFELVGEPSIYCTSNDDQVG I WSGPAPQCI I PNKC 

TPPNVENGILVSDNRSIjFSLNEVVBFRCQPGFVMKGPRRVKCQA 

LNKWEPBIiPSCSRVCQP PPEIIiHGEHTPSHQDNFSPGQEVF YS C 

E PGYDLRGAASLHCTPQGDWSPEAPRCAVKSCDDFLGQIjPHGRV 

L FPLNLQLGAKVS FVCDEGFRLKGSS V5HCVLVGMRSLWNNS VP 

VCEHIFCPNPPAILNGRHTGTPSGDI PYGKE IS YTCDPHPDRGM 
5942


4509


688


TFNLIGESTIRCTSDPHGNGVWSSPAPRCEL^VRAGHCKtPEQP - 



' nwwtnf n^Bua ¥(U\Un^.AirAUr 

P FAS PTI P I NDF E FP VGTS LNYECRPGYFGKM FS I S CLENLVWS 
S VEDNCRRKS CG PP P E PFNGMVH INTDTQFGSTVN YSCNEGFRL 
IGSPSTTCLVSGNNVTWDKKAPICEIISCEPPPTISNGDFYSNN 
RTS FHNGTWTYQCHTGPDGEQLFELVGERS I YCTS KDDQVGVW 
SSPPPRCISTNKCTAPBVENAIRVPGNRSFFSLTEIIRFRCQPG 
FVMV GSHTVQCQTNGRWGPKLPHCSRVOQ P P PE I LHGEHTL SHQ 
DNFSPGQEVFYSCEPSYDLRGAASLHCTPQGDWSPEAPRCTVKS 
CDD FLGQL PHGR VLI* PLNLQLGAKVS FVCDEGFRUCGRS AS HCV 
LAGMKALWNSSVPVCEQIFCPNPPAILNGRHTGTPLGDIPYGKE 
VSYTCDPHPDRGMTFNLIGESTIRRTSEPHGNGVWSSPAPRCEL 
PVGAACPHPPKIQNGHYlGGHVSLYLPGrfTISYTCDPGYLLVGK 
GF 1 FCTDQG I WS QLDHYCKE VNCS FPLFMNG I S KE LEMKKVYH Y 
GDYVTIjKCEDGYTLEGSPWSQCQADDRWDPPLAKCTSRTHDAI,I 
VGTL5GTIFFI LLI IFLSWI I LKHR KGNNAHE N P KEVA I HLHSQ 

ggssvhprtlqtnbensrvlp 

yly^mranpiaVgIshi^yqidpplVrkhreqVlvibVvgrkl" 
dk\aqmirfeertgyfsstdlgrtashyyikyntietfnelfda 
hktegdi fai vs kaeefdq i kvreee ieeldtllsnfcels tpg 

GVENSYGKIN1LLQTYINRGEMDSFSLISDSAYVAQNAARIVRA 

LFEIALRKRWPTMTYRLLNLSKAIDKRLWGWASPLRQFSILPPH 

MLTRLEEKKLTVDKLKDMRKDEIGHIIJIHVNIGLKVKQCVHQIP 

SVMMEAFIQP ITRTVLRVTLS I YADFTWNDQVHGTVGEPWWIWV 

EDPTNDHI YHSE YFLALKKQVISKEAQIiIiVFTI P I FEPLPSQ YY 

IRAVSDRWLGAEAVCIINFQHLILPERHPPHTELLDLQPLPITA 

LGCKAYEALYNFSHFNPVQTQI FHTLYHTDCNVLLGAPTGSGKT 

VAAELAI FRVFNKYPTSKAVYIAPLKALVRERMDDWKVRIEEKL 

GKKVI ELTGDVTPDMKS 1AKADLIVTTPEKWEGVS RS WQNRNYV 

QQVTILIIDEIHLLGEERGPVLEVIVSRTNFISSHTEKPVRIVG 

LSTALANARDLADWLNIKQMGLFNFRPSVRPVPLEVHIQGFPGQ 

HYCPRMASMNKPAFQAIRSHSPAKPVLI FVSSRRQTRLTALELI 

AFLATEEDPKQWLNMDERBMBNI IATVRDSNLKLTLAFGIGMHH 

AGLHERDRKTVEBLFVNCKVQVLIATSTIAWGVNFPAHLVI I KG 

TEYYTOKTRRYVDFPITDVLQMMGRAGRPQFDDQGKAVILVHDI 

KKDFYKKFIiYEPFPVESSLLGVLSDHLNAEIAGGTITSKQDALD 

YITWTYFFRRLIMNPSYYNLGDVSHDSVNKFr^HLIEKSLIELE 

LS YCIE IGEDNRS IEPLTYGRIASYYYIiKHQTVKMFKDRLKPEC 

S TSELLS I LSDAEEYTDLPVRHNEDHMNS EIiAKCL P I ESNPH S F 

DS PHTKAIJLLIflAHLSRAMLP CPDYDTDTKTVLDQALRVOQAMI* 

DVAANQGWLVTVLNITNLIQMVIQGRWLKDSSLLTLPNIENHHL 

HLFKKWKPIMKGPHARGRTS1ECLPELIHACGGKDHVFSSMVES 

ELHAAKTKQAWNFLSHLPEINVGISVKGSWDDLVEGHNELSVST 

L7ADKRDDNKWIKLHADQEYVLQVSLQRVHFGFHKGKPESCAVT 

PRFPKSKDEGWFLILGEVDKRELIALKRVGYIRNHHVASLSFYT 

PE IPGRYI YTLYFMSDCYLGLDQQYD/NLSQRYTSES FCTGQHQ 

GL 



2274 



DKPTRHKTYLSSSWAKMAAAEGPVGDGE LW^TWLPNriWFtkLR ' 
EGLKNQS PTE AEKPAS SSLPS S P P PQLLTRNWFGLGG ELFL WD 
GEDSS FLWRl»RGPSGGG\ EE PALSQ YQRLLCINPPLFE I YQVL 
LS PTQHHVAL I GI KGLMVLELPKRWGKNSE FEGGKS TVNCSTTP 
VAER FFTSSTS LTLKHAAWYPSE I LDPHWLLTSDNVIRI YSLR 
EPQTPTNVIILSEABEESLVLNKGRAYTASLGETAVAFDFGPLA 
AVP KTLFGQNG KDEWAYPLYILYENGETFLTY I S L LHS PGN / 1 
WKAVGS IAHAS \ AAEDNYG YDACAVLCLPCVPN I L VIATESGML 
YHCVVLEGEEEDDHTSEKSWDSRIDLIPSLYVFECVBLELALKL 
ASGEDDPFDSDFSCPVKLHRDPKCPSRYHCTHEAGVHSVGLTWI 
HKJLHKFLGSDEEDKDSLQELSTEQKCFVEHILCTKPLPCRQPAP 
IRGFWIVPDILGPTMICITSTYECLIWPLLSTVHPASPPLLCXR 
EDVEVAESPLRVLABTPDSFEKHIRSILQRSVANPAFLKASEKD 
IAPPPEECLQLLSRATQVFREQYILKQDLAKBE1QRRVKLLCDQ 
KKKQLBDLS YCREERKSLREMAERIiADKYEEAKEKQED IMNRMK 






WO 01/53312 



PCT/US00/34263 



SEQ 
ID 

NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








KLLHS FHSEL PVLS bSERbMKKELQL X PDQLRHLGNA I KQVTMK 
KDYQQQKMEKVLSLPKPT I ILSAYQRKCIQS ILKEEGEHI REMV 
KQINDIRNHVNF 


5944 


167 


3428


PSIATFTDEPEVLTEPPSATTTTTIGISATWTTlAGSHGKRNNT 
ITTTSSKRKNRKNKITPENVQIIPDDPLPISYSQPEKVNGESKS 
SSTSESGDSDNMRISSCSDESSNSNSSRKSDNHSPAWTTTVSS 
KKQ PS VLVTFPKEERKS VSGKAS I KLSETISEGTSNSLSTCTKS 
GPS PLS SPNGKLTVASPKRGQKREEGWKBWRRSKKVSVPSTVI 
SR V IGRGGCN INA1 R EFTGAH I DI DKQKDKTGDR 1 1 TIRGGTES 
TROATQLINALI KDPDKE I DELIPKNRLKSSSANSKIGS SAPTT 
TAANTSLMGIKMTTVALS STS QTATALTVPAIS SASTHKTIKNP 
VN\NVRPGPPVS FP \ LAYP P PQFAHALLAAQTFQQIRPPRLPMT 
HFGGTFPPAQSTWGPFPVRPLS PARATNS PKPHMVPRHSNQNSS 
GSQVNSAGSLTSSPTTTTSSSASTVPGTSTNGS PSS PSVRRQLF 
VTWKTSNATTTTVTTTASNNNTAPTNAT YPMPTAK EHYP VS S ? 
SSPSPPAQPGGVSRNSPLDCGSASPNKVASSSEQEAGSPPWET 
TNTRPPNSSSSSGSSSAHSNQQQPPGSVSQEPRPPLQQSQVPPP 

E VRMTVP P L»ATS S A P VA VP Q T A D V*T V D M on TT>M (va t\ or n vm c»rr» 

PAIRPPPHGTTAPHKNSASVQNSSVAVLSVNHIKRPHSVPSSVQ 
LPSTLSTQSACQNS VHPANKP I APNFSAPL PFGPFS TLFENS PT 
SAHAFWGGSWSSQSTPESMLSGKSSYLPNSDPLHQSDTSKAPG 
FRPPLQRPAPSPSG I VNMDS PYGSVTPSSTHLGNFASNISGGQM 
YG PGA PLGGAPAAANFNRQH FS PLSLLTP CS SASNDS SAQSVSS 
GVRAPSPAPSSVPLGSEKPSNVSQDRKVPVPIGTERSARIRQTG 
TSAPSVIGSNLSTSVGHSGIWSFEGIGGNQDKVDWCNPGMGNPM 
IHRPMSDPGVFSQHOAMERDSTGIVTPSGTFHQHVPAGYMDFPK 
VGGMPFSVYGNAMI PPVAPI PDGAGGPI FNGPHAADPS WNSL3 K 
MVS SSTENNGPQTVWTGPWAPHMNSVHMNQLG 


5945 


1461 


197 


GVTHLFLFGKRKLRNG I AKDLKGOADFFFLLVS EA WATGSPRA 
WLTCLI LPLPGI I FS VLPKAMSRPLLI TFTPATDPSDLWKDGQQ 
QPQPEKPESTLDaAAARAFYBALIGDESSAPDSQRSQTEPARER 
KRKKRRIMKAPAAEAVAEGASGRHGQGRSLEAEDKMTHRILRAA 
Q2GDLPELRRLLEPHEAGGAGGNINAPJDAFWWTPLMCAARAGQG 
AAVSYLLGRGAAWVGVCELSGRDAAQIiAEEAGFPEVARMVRESH 
OETRSPENRSPTPSLQYCENCDTHFQDSNHRTSTAKLLSLSQGP 
QPPNLPLGVPISSPGFKLLIiRGGWEPGMGIiGPRGEGRANPIPTV 
LKRDQEGLG YRSAPQ PRVTHFP AWDTRAVAGRE \TPPRVATLSW 
RE ERRREE \ KDRAWE RDLRTYMNLE F 


5946


541 


1666 


ILGSYSSIQPEEYS \SWC\EWLQDLLA\YVSPK\HSYLRDLP 
SEGSPQRVNSIDFV\ EL\EHLQPDVLVHAVLRWDF/TI LTEAV 
YS YRG QKQKKVMLTVEQAQDQHYAL VL WG PGAAW \ Y PQLQRKKG 
YIWEFKYLFVQCNYTLENXiELHTTPWSSCBCLFDDEIRAITFKA 
KFQKSAPS FVKI SDLATHLEDKCSGWL IKAQ ISELAFP ITASQ 
KIALNAHS SLKS I FS SLPNI VYTG CAKCG LEI iE TDENR I YKQCF 
SCLPF1MKKIYYRPALMTAIDGRHDVCIRVESKL1EKILLNISA 
DCLNRVIVPSSEITYGMVVADLFHSLLAVSAEPCVl^IQSLFVL 
DENSYPLQQDFSLLDFYPDIVKHGANARL 


5947


3 


1317 


RG I PDRRRRGP IGRVNMDLENKVKKMGLGHEQGFGAPCLKCKEK " 
CEGFELHF WRKI CRNC \NVAKKSM/ TVLLSNE EDRKVGKLF3DT 
KYTTL I AKLKS DG I PM YKRNVM I LTNP VAAKKNVS INTVTYEWA 
PPVQNQAIARQYMOMLPKEKQPVAGSEGAQYRKKQLAKQLPAHD 
Q D PS KCHELS PRE VKEMEQ FVKKYKS EALG VGD VKLPCEMDAQ G 
PKQMNI PGGDRSTPAAVGAMBDKSAEHKRTQYSCYCCKLSMKEG 
D PAI YAERAG YDKLWHPACF VCSTCHELLVDMI Y FWKNBKL YCG 
RHYCDS EKPRCAGCDEL I FSNEYTQAENQNWHLKHFCCFDCDS I 
LAGE I YVMVNDKP VCKPCYVKNHAVVCQGCHNAIDPEVQRVTYN 
NFSWHASTECFLCSCCSKCLIGQKFMPVEGMVFCSVECKKRMS


5948


39 


3370 


YRERYPVSGGSVLRSALEVCWDFLSGLTEGSLLPEGFFSGPlDQ 
GNHYQMRRKGRCHRGSAARHPSSPCSVKHSPTRETLTYAQAQRM 
VEIE1EGRLHRXS I FDPLE I ILEDDLTAQEMSECNSNKENSERP 
PVCLRTKRHKNNRVKKKNEALPSAHGTP ASASALPE PKVR I VEY 


5949






S PPS APRRP P V YYKF I E KSAE ELDNE VE YDMDEED YAWLE I VNB 
KRKGDCVPAVSQSMFEPLMDRFEKESHCENQKQGEQQSLIDEDA 
VCCI CMDGECQNSNVILPCDMCNLAVHQECYGVPYI PEGQWLC / 
RAHCLQSRARPADCVLCPNKGGAFKKTDDDRWGHV\ VCALW\ I P 
E\VGFANTVFIEPIDGVRNIPPARWKLT\CNLCKEKGR/VGACI 
QCHKANCYTAFHVTCAQKAGLYMKMEPVKELTGGGTTFSVRKTA 
YCDVHTP PGCTRRPLN I YGDVBMKNGVCRKESS VKTVRS TS KVR 
KKAXKAKKALAE PCAVLPTVCAPYI PPQRLNRIANQVAIQRKKQ 
FVERAHSYWLLKRLSRNGAPLLRRLQSSLQSQRSSQQRENDEEM 
KAAKEKLKYWQRLRHDLERARLLIELLRKREKLKREQVKVEQVA 
MBLRLTPLTVLLRSVLDQLQDKDPARIFAQPVSLKEVPDYLDHI 
KHPMDFATMRXRLEAQGYKNLHEFEEDFDLIIDNCMKYNARDTV 
FYRAAVRLRDOGGWLRQARREVDS I GLE EASGMHLPERPAAAP 
RRPFSWEDVDRLLDPANRAHLGLEEQLRELLDMLDLTCAMKSSG 
SRSKRAKLLKKEIALLRNKLSQQHSQPLPTGPGLEGFEEDGAAL 
GPEAGEEVLPRLETLLQpRKRSRSTCGDSEVEEESPGKRIiDAGI* 
TNGFGGARSEQEPGGGLGRKATPRRRCASESS ISSSNSPLCDSS 
FNAPKCGRGKPALVRRHTLEDRSEL ISCIENGNYAKAAR T AAEV 
GQSSMWISTDAAASVLBPLKWWAKCSGYPS YPAL1 IDPKMPRV 
PGHHNGVTIPAPPLDVLKIGEHMOTKSDEKLFLVLFFDNKRSWQ 
WL PKS KMVPLG IDETIDKLKMMEGRNS S IR KAVR IAFDRAMNHL 
SRVHGEPTSDLSD I D 


5950 


39 


3370 


yrbryp^ggsvlrsalbVcWdflsgltbgsllpegffsgfidqH 

GNHYQMRRKGRCHRGSAAP^PSSPCSVKHSPTRETLTYAQAQ^M 

VEIEIEGRLHRISIFDPLBI I LEDDLTAQEMS ECNSNKENS ERP 

PVCLRTKRHKNNRVKKKNEALPSAHGTPASASALPEPKVRIVEY 

SPPSAPRRPPVYYKFIEKSAEELDNEVEYDMDEEDYAWIiEIVNE 

KRKGDCVPAVSQSMFEFLMDRFEKESHCENQKQGEQQSLIDEnA 

VCCI CMDGECQNSNVILFCDMCNLAVHQECYGVP Y IPEGQWLC / 

RAHCLQSRARPADCVLCPNKGGAFKKTDDDRWGHV\VCALW\IP 

E\VGFANTVFIEPIDGVRNIPPARWKLT\CNLCKEKGR/VGACI 

QCHKANOfTAFHVTCAQKAGLYMKMEPVKELTGGGTTFSVRKTA 

YCDVHTPPGCTRRPLNIYGDVEMKNGVCRKESSVKTVRSTSKVR 

KKAKKAKKALAEPCAVLPTVCAPYIPPQRLNRIANQVA1QRKKQ 

FVERAHSYWIiLKRLSRNGAPLLRRLQSSLQSQRSSQQRENDEEM 

KAAKEKLKYWQRLRHDIjERARLLIELLRKREKLKREQVKVEQVA 

MELRLTPLTVLLP^VLDQLQDKDPARIFAQPVSLKEVPDYLDIII 

KHPMDFATMRKRLEAQGYKNLHEFEEDFDLIIDNCMKYNARDTV 

FYRAAVRLRDQGGWLRQARREVDS IGLEEAS GMHLPERPAAAP 

RRPFSWBDVDRIiDPANRAHIXSLEEQLRELLDMLDLTCAMKSSG 

SRS KRAKLLKKEI ALLRNKLSQQHSQPL PTGPGLEG FEEDGAAL 

GPEAGEEVLPRLETLLQPRKRSRSTCGDSEVEEESPGKRLDAGL 

TNGFGGARSEQEPGGGLGRKATPRRRCASESSISSSNSPLCDSS 

FNAPKCXjRGKPALVRRHTLEDRSELISCI ENGNYAKAARI AABV 

GQSSMWISTDAAASVLEPLKVVWAKCSGYPSYPALIIDPKMPRV 

PGHHNGVTIPAPPLDVLKIGEHMQTKSDEKLFLVLFFDNKRSWQ 

WLPKSKMVPLGIDETIDKLKMMEGRNSSIRKAVRIAFDRAMNHL 

SRVHGEPTSDLSDID * I 






1166 


373 


ESRSLTMSTSQPGACPCQGAASRPAI LYALLSSSLKAVPRPRSR 
CLCRQHRPVQLCAPHRTCREAIiDVLAKTVAFLRNLPSFWQLPPQ 
DQRRLLQGaiGPIiFLLGLAQDAVTFEVAEAPVPSILKKILLEEP 
SSSGGSGQLPDRPQPSLAAVQWLQCCLES FWSLELSPKE \ YACL 
KGPILFNPDVPGLQAASHIGHLQQEAHWVLCEVLEPMCPAAQGR 
LTRVLLTASTLKSIPTSLLGDLFFRPI IGDVDXAGLLGDMLLLR 




5951 


143 


^449 



WNVKPSLiWQLFKFSDKEEHEQNDSlSGKTGElGVBEMlArRiH 

VEQDSKETVKLSHEDDHlLEDAGSSDISSDAACrNPNKTENSLV 

SLPSCVDEVTECNLELKDTMGIADKTENTLBRNKIEPIiGYCEDA 

ESNRQLESTE FNKSMLE WDTST FGPE SNILENAI CDVPDQNTS K 

DLNAIESTKIESHETANLQDDRNSQSSSVSYLESKSVKSKHTKP 

/IHS KQNMTTDAP KKTVAAKYEV1HS KTKVNVKS VKRNTDVPES 

2QNFHRP\nCVRKKQIDKEPKIQS(^SGVKSVKNQ^HSVLKKTLQ 


5952 






DQTLVQIPKPb'I-HSLSDKSHAHPGCLKEPHHPAQTGHVSHSSQK 

QOTKPQQQAPAMKTNSHVKEELEHPGVEHFKEEDKLKLKKPEKN 

LQPRQRRSSKSFSLDBPPLPIPDNIATIRREGSDHSSSFESKYM 

WTPSKQCGFCKKPHGNRPMVGCGRCDDWFHGDCVGLSLSQAQQM 

GEEDKE YVC VKCCAEEDKKTE I LDPDTLENQATVE FHSGDKTME 

CEKLGLS KHTTNDRTKYI DDT VKHICVKI LKRESGEGRNS SDCRD 

NE I KKWQLAPLRKMGQ ? VLPRRS SEEKSE KI PKESTT VTCTGE K 

ASKPGTHEKQEMKKKKV\EKGVLNVHPAASASKPSADQIRQSVR 

HSLKDILMKRLTDSNLKVPEEKAAKVATJCIEKELFSFFRDTDAK 

YKNKYRSLMFNLKDPKNNILFKKVLKGEVTPDHLIRMSPEELAS 

KELAAWRRRENRHTIEMIEKEQREVERRPITKITHKGEIEIESD 

APMKEQEAAMEIQEPAANKSLEKPEGSEK\RKEEVDSMSKDTTS 

QHRQHLFDLNCKICIGRMAPPVDDLSPKKVJCVWGVARKHSDNE 

AESIADALSSTSNILASEFFEEEKQESPKSTPSPAPRPEMPGTV 

EVESTFLARLNFIWKGFINMPSVAKFVTKAYPVSGSPEYLTEDL 

PDS I Q VGGRI S PQTVWDYVE KI KASGT KE I CWRFTP VTEEDQ I 

S YTLLFAYFSSRKR YGVAANNMXQVKDMYLI PLGATDKI °HPLV 

PFDG PGLELHRPNLLLGLI IRQKLKRQHSACASTSH I AETPES A 

PP IALPPDKKSKlEVSTEEAPEEENDFFNS FTTVLHKQRNKPQQ 

NLQEDLPTAVEPLMBVTKQEPPKPLRFLPGVLIGWENOPTTLEL 

ANKPLPVDDII^SLLGTTGQVYDQ\AQSVMEQNTVKBIPFLNEQ 

TNSKIEKTDNVEVTDGENKEIKVKVDNISESTDKSAEIETSVVG 

S5S I SAGSLTSLS LRGKPPDVSTEAFLTNLS IQSKQEETVESKE 

KTLKRQLQEDQENNLQDNQTSNSSPCRSNVGKGNIDGNVSCSEN 

LVANTARS PQFINLKRD PRQAAGRS QPVTTS ESKDGDSCRNGEK 

HMLPGLSHNKEHLTEQ INVEEKLCS AEKNSCVQQSDNLKVAQNS 

PS VENI QTSQAEQAKPLQED ILMQNI ET VHPFRRGSAVATSHFB 

VGNTCPSEFPS KS ITFTSRSTS PRTSTNFS PMRPQQPHLQHLKS 

SPPGFPFPGPPNFPPQSMFGFPPHI,PPPLLPPPGFG\FA\QNPM 

VPWPPW\HLP\GQPQR1^3GPLSQASRYIGPQNFYQVKD T RRPE 

RRHSDPWGRQDQQQLDRPFNRGKGDRQRFYSDSHHLKRERHEKE 

WEQESERHRRRDRSQDKDRDRKSREEGHKDKERARLSHGDRGTD 

GKASRDSRNVDKXPDKPKSEDYEKDKEREKS KHREGEKDRDRYH 

KDRDHTDRTKSKR 




3226 


639 


PPARR3ARDliPRAL3MEAARPSGSWNGALCRLL\LVTti\AFLIF 
ASI1ACKNVTLHVPSKLDAEKLVGRVNLKECFTAANL IHS SDPDF 
Q I LEDGS VYTTNTI LLS S EKRSFTI LLSNTENQE KKKI F VFLEH 
OTKVLKKRHTKEKVUU^RRWAPIPCSMLENSLGPFPLFIO^V 
QSDTAQNYTIYYSIRGPGVDQEPRNLFYVERDTG2TLYCTRPVDR 
EQ YES FE I IAFATTPDG YTPELPLPL 1 1 KI EDENDNYP I FTE ET 
YTFTI FENCRVGTTVGQVCATDKDEPDTMHTRLKYS I IGQVPPS 
PTLFSMHPTTGVITTTSSQLDRELIDKYQLKIKVQDMDGQYFGL 
QTTSTCI INIDDVNDHLPTFTRTS YVTS VEENTVDVEILRVTVB 
DKDLVNTANWRANYTILKGNENGNFKIVTDAKTNEGVLCVVKPL 
NYEBKQQMILQIGVVNEAPFSREASPRSAMSTATVTVNVBDQDE 
GPECNPPIQTVRMKENAEVGTTSNGYKAYDPETRSSSGIRYKKL 
TDPTGWVTIDENTGSIKVFRSLDREAETIKNGIYNITVLASDQG 
GRTCTGTLGI I LQDVNDNS PF I PKKTVI I CKPTMSS AE I VAVDP 
0EPIHGPPFDFSLESSTSEVQRMWRLKAINDTAARLSYQNDPPF 
GSYWPITVRDRI^MSSVTSLDVTLCDCITENDCTHRVDPRIGG 
GGVQLGKWAILAILIjGIALFFCI LFTLVCGASGTSKQPKVI PDD 
i-LTvw iv uivoHi n/\irvjLiiJJvV I o ASGr TTQTVGASAQG VCGTVGSG 

IKNGGQETIEMVKGGHQTSESCRGAGHHHTLDSCRGGHTEVDNC 
RYTYSEWHSFTQPRLGEESIRGHTLIKN 


5953 
5954


330


811



PJ^CNPDPGWYWWVKQESEISKESQEMDARPKLDLGFKEGQTIK 
bCIGNITNKKGGASKPRTARGGGLSLLPPPPGGKVTIPPPSS /V 
KLPSTNHVTPPSIPKSNHGGSDADILLDIjDSPAPVTTPAPTPVS 
iTSNDLWGDFSTASSSVPNOAPQPSNWVQF




32 


2130 



PPPPPPKIJ^MADLFAVl^VSYLMAMEKSKATPAA^iailVL'^ 
PEPSIRSVMQKYLAfiRNEITFDXIFNQKIGFLLFKDFCLNEINE 
^VPQVKF YE BI KE YEKLDNBEDRLCRSRQI YDAYIMKELLS CSH 








PFS KQAVEHVQSHLSKKQVTSTIjFQPYI BE 1 ckslrGd t fqkPm" 
ESDKFTRFCQWKNVBIiNIHLTMNEFSVHR I XGRGGFGEVYGCRK 

ADTGKMYAMKCLNKKRIKMKQGETLALNERIMLSLVSTGDCPFI
VCMTYAFHTPDKLCFILDLMNGGDLHYHLSQHGVFSEKEMRFYA

TEIILGLEHMHNRFWYRDLKPANILLDEHGHARIS\DLGLACD 
F 9 KKKPHAS VGTHGYMAP E VLQKGTAYDS S ADWFSLG CMLF KLI> 
RGHSPFRQHKTKDKKEIDRMTLTVNVELPDTFSPELKSLLBGIiL 
CRDVSKRIiGCHGGGSQEVKEHSFFKGVDWQHVYLQKYPPPLlPP 
RGEVNAADAFD IGSFDEEDTKG I KLLDCDQEL YKNFPLVI S ERW 
QQEVTETVYEAVNADTDKI EARKRAKNKQLGHEEDYALGKDCIM 
HGYMLKWSNPFIiTQWQRRYFYLFPNRLEWRGEGESRQNLLTMEQ 
ILS VBETQI KDKKCILFRI KGGKQFVLQCBSDPEFVQWKKELNB 
TFKEAQRLLRRAPKFLNK PRSGTVELPKPS LCHRNSNGL 


5955 


1726 


444 


KREREFRIAVCPLRYPSAYESSPGTELRECGLCRSGQBFADCRR ' 
PANRQDVLSG W INLPVLQLTKDPLKTPGRIiDHGTRTAF I HHREQ 
VWKRC INI WRD VGLFG VLNE I ANS EEEVFE W VKTASGWALAL CR 
WAS SLHGS LFPHLS LRSE DLIAEFAQVTNWS S CCLRVFAWHPHT 
NKFAVALLDDS VRVYNAS STI VPS L KHRLQRNVAS LAWKPLS AS 
VLAVACQSC ILIWTLDPTSLSTRPS SGCAQVLSHPGHTPVTSLA 
WAPSGGRI^LSASPVDAAlRVWDVSTETCVPLPWFRGGGVTNUiW 
SPDGSKILATTPSAVFTIVWEAQMWTCERWPTLSGRCQTGCWS PD 
GSRLLFTVLGE PL I YSLS FP EROGEGKG\ ALB VQSQQRLWQ I CL 
RQQYRHQMVRRGLGERLTPWSGTPVGNVWLCL 


5956 


1705 


139 


GVGVRGARAMAT VQEKAAAIiNLSALriS PAHRPPGP'SVAQKP FGA 
TYVWSSIINTIiO j rOVEVKKRRHRIjiaiH}JTy^PVQQT?a\mvTT?cs , arT 
I QNKYFGDVDI PRAKWRVCQALMDY KVFEAVPTKVFGKD KKPT 
FEDSSCSLYRFTTIPNQDSQLGKENKLYSPARYADALFKSSDIR 
S AS LEDLWENLSLKPANS PHVNI SATLS PQVI NEVWQEET I GRL 
LQLVDLPLLDSLLKQQEAVPKIPQPKRQSTMVNSSNYLDRGILK 
AYSDS QEDEWLSAAIDCSE YLPDQM WB I SRS FPEQPDRTDLVK 
ELLFDAI GRYYSSREPLLNHLSDVHNGI AELLVNGKTE IALEAT 
QLLLKUiDFQNREEraRLLYFmVAANPSEFKLQKESDNRMVVK 
R I FS KAI VDNKNLS XGKTDLLVLFL \MDHQKDVFKI PGTL \HKI 
VS\VK\ LMAIQNGRDPNRDAGYI YOQRIDQRDYSNNTEKTTKDE 
LLNLL KTLDEDS KLS AKEKKK\ LLG QFYKCHPDI F I EHFGD 


5957 


1479 


451 


EWVAVAMDTLDRWKPKTKRAKRFLEKREPKLNENI KNAMLI K " 

GGNANATVTKVLKDVYALKKPYGVLYKKKNITRPFEDQTSLEFF 

SKKSDCSLFMFGSHNKKRPNNLVIGRMYDYHVIiDMIELGIBNFV 

SLKDIKNS KCPEGTKPMLI FAGDDFDVTEDYRRLKS LLIDFFRG 

PTVSNIRLAGLBYVLHFTALNGKIYFRSYKLLLKKSGCRTPRIB 

LEE^PSI^LVLRRTHLASDDLYKLSMKMPKALKPKKKKNISHD 

T?GTTYGRIHMQKQDI»SKLQTRKM\KGLKKRPAERIT3DHBKJCS 

KRIKKKLMBLSQPLLFHCVLLKRIIKHQSIQSFL 


5958 


1 


3138 


AAAUSMI^WFPACQAFNLDVEKLTVYSGPKGSYFGYAVDFHiPD 
ARTASVLVGAPKANTSQPDIVEGGAVYYCPWPAEGSAQCRQIPF 
DTTNNRKI RVNGTKE P I EFKSNQWFG\ ATVKA\HKG KSCGPVAP 
LLFTWRNFLKPTPEKGPVGTCYVAIQNFSAYAEFSPCGNSNADP 
EGQGYCQAG FS LDF YKNGDLI VGG PGS F YWQGQVI TASVAD I IA 
NYS FKDI LRXLAGBKQTEVAPAS YDDS YLG YS VAAGEFTGDSQO 
ELVAGIPRGAQNFGYVS IINSYDMTFIQNFTGEQMASYFGYTW 
VS DVNS DG LDD VLVGAP L FM ERE FESN P R E VGQ I YL YLQ VSSLJb 
FRDPQ I LTGTETFGRFGS AMAHLGDLNQDGYNDIAI G VP FAGKD 
QRGKVL I YNGNKDGLNTKPFPKFCQGVWASHAVPSGFGFTLRGD 
SDIDKNDYPDLrVGAFGTGKVAVYRARP VVTVDAQLLLHPMI IN 
LENKTCQVPDSMTSAACFSLRVCASVTGQS IANTIVLMAE VQLD 
S LKQKGAI KRTLFLDNHQ AHRVFPLVIKRQKSHQCQDF I VYLRD 
ETEFRDKIiSPINISLNYSLDESTFKEGLEVKPILNYYRENIVSE 
QAHI LVDCGEDNLCVPDLKLSARPDKHQVI IGDENHLMLI INAR 
NEGEGAYEAELF VM I PEEADYVG I E RNNKG FRP LSCEYKMENVT 
RMWCDLGNPMVSGTNYSLGLRFAVPRIiEKTNMSINFDLQIRSS 
N KDN PDSNFVS LQINITAVAQVE IRGVSHPPQI VLP IHNWEPEE 








EPHXBEBVGPLVEHIYELHNIGPSTISDTILEVGWPFSARDEFL 
IiY I FHIQTLG PLQCQ PNPNINPQDI KPAASP2DT PE L>S AFLRNS 
TI PHLVRKRDVHWEFHRQSPAKI LNCTNI ECLQI SCAVGRLEG 
GESAVLKVRSRLWAHTPLQRKNDPYAIASLVSFEVKKMPYTDQP 
AKLPEGS IAIKTSVIWATPNVS PS IPLWVTILAILLGLLVLAIL 
TLALWKCG PFDRAR PPQEDMTDREQLTNDKT PEA 


5959 


1 


1166 


GTSGYAAQQLPSLLKEREFHLGTI^KVFASQWLNHRQVVCGTKC 
NTLFWDVQTSQI TKI P I LKDREPGGVTQQGCGIHAIELNPSRT 
LIiATGGDNPWSLMYRLPTLDPVCVGDDGHKDWIPSIAWISDTM 
AVSGS RDGSMGLWEVTDDVLTKS DARHNVS RVPVYAH I THKALK 
DIPKEDTNPDNCKVRAI*AFNNKNKEI/»VSLDGYFHLWKAErnX 
S KLLS TKLP YCRENVCLAYGSE V?S VYAVGSQAHVS FLDPRQPS Y 
NVKSVCSRERGSGIRSVSFYEHIITVGTGQGSLLFYDIRAQRFL 
EERLSACYGSKPRLAGENLKLTTG\KGWLNHDETWRNYFSDIDF 
FPNAVYTHC YDS SGTKL FVAGGPL PS GLHGNYAGLWS 


5960 


2853 


870 


FWSDGGPRPRRGPAVGAGAAHLSDPWAMTPGTANRATttPLNKB 
LDWASINGPCEQIiNEDPEGPPLATRLLAHKIQSPQEWEAIQALT 
VLETCMKSCGKRFHDE VGKFRFLNEL I KWS PKYLGS RTS EKVK 
NKILELLYSWTVGLPEEVKIAEAYQMLKKQG\IVRSDPKLPDDT 
TFPLPPPRPKNVI FEDEE KSKMLARLLKSSHPEDLRAANKL1KE 
KVQEDQKRMEKI S KRVNAIEEVNNNVKLLTEMVMSHSQGGAAAG 
SSEDL\MKBL\YQRCERMRPTLFPTGRVDTEDND\EAIiAEItiQA 
NDNLTQ VINLYKQLVRGE E VNGDATAGS I PGSTS ALLDLSGLDL 
P PAGTTYPAMPTRPGEQASPEQPS AS VSLLDDE LMSLGLSD PTP 
PSGPSLDGTGWNSFQSSDATEPPAPALAQAPSMESRPPAQTSLP 
ASSGLDDLDIiIXSKTTLI^SLPPBSO^VRWEKQQPTPRLTLRDLQ 
NKSSSCSSPSSSATSLLHTVSPEPPRPPQQPVPTELSLASITVP 
LES IKPSNILPVTVYDQHGFRILPHFARDPLPGR5DVLVVVVSM 
LSTAPQ P I RN I VFQS AVP KVMKVKLQ P PSGTE L PAFN P I VHP S A 

ITQVLLLANPQKEKVRLRYKLTPTMGDQTYNEMGDVDQFPPPET 
WGSL 


5961 


198 


3147 


SGE PRPEPGNMAT CIGEK I EDFKVGNLLGKGS FAG VYRAES I HT 
GLEVAIKMIDKKAMYKAGMVQRVQNEVKIHCQLKHPSILELYNY 
FEDSNYVYLVLEMCHNGEMHRYLKNRVKPFSENEARHFMHQI IT 
GMLYLHS HG I LHRDLTLSNLLLTRNMN I KIADFGIATQL KMPHE 
KHYTLCGTPNYISPEIATRSAHGLESDVWSLGCMFYTLLIGRPP 
FDTDTVKNTLNKWLADYEMPTFLS IEAKDLIHQLLRRNPADRL 
SLSSVLDHPFMSRKSSTBCSKDLGTVEDSIDSGHATISTAITASS 
STS I SGSLFDKRRLLIGQPLPNKMTVFPKNKSSTDFSSSGDGNS 
FYTQWGNQETSNSGRGRV I QDAEERPHSRYLRRA YSSDRSGTSN 
SQSQAKTYTMERCHSAEMLSVSKRSGGGENEERYSPTDNNANIF 
NFFKEKTSSSSGSFERPDNNQALSNHLCPGKTPFPFADPTPQTB 
TVQQWFGNIX) I NAHLRKTTE YDS ISPNRDFQGHPDLQKDTS KNA 
WTDT KVKKNSDAS DNAHS VKQQNTMKYMTALHS KPE I IQQECVF 
GSDPLSEQSKTRGM3PPWG YQNRTLRS I TSPLVAHRLKP IRQKT 
KKAWSILDSEEVCVELVKEYASQEYVKEVLQISSDGNTITIYY 
PNGG\RGFPLA\DRPPSPT\DNISR\YSF\DNIiPEKYWRKYQYA 
SRFVQLVRSKS PKI TYFTRYAKCILMBNSPGADFEVWFYDGVKI 
HKTEDFIQVTEKTGKSYTLKSESEVNSLKEEIKMYMDHANEGHR 
ICLALESI ISEEERKTRSAPFFPI I IGRKPGSTSSPKALSPPPS 
VDSNY PTRDRAS FNRMVMHSAASPTQAP I LNPSM VTN3GI/GL7T 
TASGTD I SSNSLKDCLPKSAQLLKSVF VKNVGWATQ\ LTSGAVW 
VQFNDGSQLWQAGVSS I S YTS PNGQ\ TTR\ YGBNEKLPDYI KQ 
KLQCLSS ILIJ^FSNPTPNFH 


5962 


20 


2447 


rvcsssastasqavMadawbeIr^^ 
NCIEIVNKLIAQKQLEWHTLDGKEYITPAQISKEMRDELHVRG

GRVN I VDLQQ V INVDL I HIE WRIGDI I KSE KHVQLVLGQLIDEN 
YLDRLAEE VNDKLQESG QVT I SELCKTYDLPGNFLTQALTQRLG 
RI ISGHIDLDNRGVI FTEAFVARHKARIRGLFSAITRPTAVNSIi 
ISKYGFQEQLLYS VLEELVNSGRLRGT WGGRQDKAVFVPD I YS 
RTQSTWVDS FFRQNG YLEFDALSRLGI PDAVSYI KKRYKTTQLI* 


5963 






FLKAACVGgGLVDQVEASVEEAISSGTWVDIAPLLPTSLSVEDA 
AILLQQVMRAFSKQASTWFSDTVWSEKF\ INDCTBL FRELMH 
Q KAEKEMKNNP VHL I TEEDXiKQ I STLBSVSTS KKDKKDERR Rtf A 
TEGSGSMRGGGGGNAREYKIKKVKKKGRiCDDDSDOESQSSHTGK 
KKPEISFMFQDEIEDFLRKHIQDAPEEFISE1AEYLIKPLNKTY 
LBWRSVFMSSTTSASGTGRKRTIKDLQEEVSNLYNNIRLFEKG 
MKFFADDTQAALTKHIjLKS VCTD ITNIiIFNFLAS DLMMAVD DPA 
AITSEIRKKILSW^SEETKVALTICLHNSLNEKSIEDFISCLDSA 
AEACD I MVKRGDKKRERQ ILFQHRQALAEQLKVTEDPAL I LHLT 
SVLLFQFSTHSMLHAPGRCVPQI I AFLNSKI PEDQHALLVKYQG 
LWKQLVS QS KKTG QGDY PLNNELDK3QEDVASTTRKE LQELSS 
S I KDL VLKSRKSS VTEE 


5964 


62 


1130 


VVt$i PQDFPGNRUbMG \QKGE IGP P \GQQGKKGAPGMP \GLMGSN 
GSPGQPGTPGSKGSKGEPGIQGMPGASGLKGEPGATGSPGEPGY 

MGLPGIQGKKGDKGNQGEKGIQGQKGENGRQGIPGQQGIQGHHG

AKGERGEKGEPGVRGAIGSKGESGVDGLMGPAGPKGQPGDPGPQ 
GPPGIiDGKPGREFSEQFIRQVCTDVIRAQLPVLLQSGRIRNCDH 
CLS QHGS PG I PG P PGP I G PEGPRGLPGLPGRDG VPGLVGVP GRP 
GVRG LKGLPGRNGEKGS QG FGYPGEQGP PGPPGPEGPPG IS KEG 

PPGDPGLPGKDGDHGKPG1QGQPGPPGICDPSLGFSVIARRDPF 
RKGPNY 




3 


2147 


SCRTRGRLSPLQPREAGSSRGSRARSEPPRPGGMEEACQVQTTK 
RGDPHELRNIFLQYASTEVDGERYMTPEDFVQRYLGLYNDPNSN 
PKIVQLLAGVADQTKDGLISYQEFLAFESVLCAPDSMFIVAFQL 
FDKSGNGEVTFENVKEI FGQTI IHHHI PFNWDCE F IRLHFGHNR 
KKHLN YTEFTQFLQELQLEHARQAFALKDKSKS GMI SGLDFSD I 
MVTIRSHMLTPFVEENLVSAAGGSISHQVSFSYFNAFNSLLNNM 
ELVRKI YS TLAGTRKDAE VTKE EEAQSAIRYGOATPIiEIDI L YQ 
LADLYNASGRLTLADIERIAPIAEGAI J PYl^AT7T.n»nnci>r'T^n 

PIWLQIAESAYRFTLGSVAGAVGATAVYPIDLVKTRMQNQRGSG 
SWGELMY2CNSFDCFKKVLRYEGFFGLYRGLIPQIjIGVAPBKAI 
KLTVKDFVRDKFTRRDGSVPLPAEVLAGGCAGGSQVIFTNPLEI 
VKZRLQ VAGE I TTG PR VSAIiNVLRDLGI FGLYKGAKAC FLRD I P 
FSAIYFPVYAHCKLLLADENGHVGGLNIJJ^GAMAG\VPAASLV 

TPADVIKTRLQVAARAGQTTYSGVIDCFRKIL\REEGPSAFMKG

TAARVFRSS PQFG \ VTLVT YELLQRG FYIDFT3GLKPAGSEPTFK 

SRIADLPPANPDHIGGYRLATATFAGIENKFGLYLPKFKSPSVA 
WQPKAAVAATQ 


5965 


1 


1498 


MVT^ YRFLPTS^MAAJCLRSLLP PDLRLQFWLHARl^k 
CGSYCAGAKASPLPGKMAMGLMCGRRBLLRLLQSGRRVHSVAGP 
SQWLGKPLTTRLLFPAAPCCCRPHYLFLAASGPR5T, QTcarcpi 

EVQVQAPPWAATPSPTAVPEVASGETADWQTAAEQSFAELGL 
GS YTPVGLIQNLI»EFMHVDLGLPWWGAIAACTVFARCIjI FPL1 V 

TGQREAARIIINHLPEIQKFSSRIREAKLAGDHIEYYKASSEMAL

YQXKHGI KLYKPL 1 L P VTQAP I FI SFF ZALR BMANLPVPS LQTG 
GLWWFQDLTVSDPIYILPLAVTATMWAVLELGAETGVQSSDLQW 
MRNVIRMMPL I TLP I TMHFPTAVFMYWLSSNLFSLVQVS CLR I P 

AVRTVLKIPQRWHDLDKLPPREGFLESFKKGWKNAEMTRQLRE 

REQRMRNQLELAARGPLRQTFTHNPLLQPGKDNPPNIPSSVSSS
SSKPKSKYPWHDTLG




102 


1925 



RSKQVMARLTKRRQADTKAIQHLWAAIEIIRKQKQIANIDRITK
YMSRVHGMHPKETTRQLSLAVKDGLIVETLTVGCKGSKAGIEQE
GYWLPGDEIDWETENHDWYCFECHLPGEVLICDLCFRVYHSKCL

SDEFRLRDSSSPWQCPVCRSIKKKNTNKQEMGTYLRFIVSRMKE 
RAI DLNKKGK1)NKHPMYRRLVHS AVDVPTIQEKVNEGKYRS YEE 
PKADAQLLLHNTVI FYGADSEQADIARML YKDTCHEL \DE LQL C 
KNCFYLANARPDNWFCYPCIPNHELDWAJ<MKGFGFWPAKVMQKB 
DNQVDVRFFGHHHQRAWIP3ENIQDITVNIHRLHVKRSMGWKKA 

:delelhqrflregrfwksknedrgeeeabssisstsneqlkvt 

2EPRAKKGRRNQSVEPKKEEPEPETEAVSSSQEIPTMPQPIEKV 
WSTQTKXLSASSPRMIiHRSTQTTNDGVCQSMCHDKYTKI FNDF 



KDRMKSDHKRBTERWREAIiE KJjKSEMKE EKRQAVN KAVANMQG 
EMDRKCKQVKEKCKEEPVEE I KKLATQHKQL I SQTKKKQWCYNC 
EEEAMYHCCWNTS YCS I KCQQEHWHAEHKRTCRRKR 



~8T 



1288 



5969



1126 



503 



4712 



RSKQVMARLTKRRQADTKAI QHLWAAIHllUKUKQIANIDRITK" 
YMS RVHGMHPXETTRQLS LAVKDGLI VETLTVGCKGS KAG I EQE 
G YWLPGDE IDWETENHDW YCFECHLPGEVLI CDLCPRVYHSKCIi 
SDE FRLRDS S SPWQCP VCRS I KKKNTNKQEMGTYLRFI VSRMKE 
RAIDIaNKKGKDNKHPMYRRLVHSAVDVPTIQEKVNEGKYRSYEE 
FKADAQIiLLHNT VI F YGADSEOADI ARMLYKDTCHEI»\ DELQLC 
KNCFYLANARPDNWFCYPCIPNHELDWAKMKGFGFWPAKVMQKB 
DNQVDVRFFGHHHQRAWIPSENIQDITVNIHRLHVKRSMGWKKA 
C DELE L HQR FLREGR FWKS KNE DRGE EEAES S I S STS NEQLKVT 

QEPRAKKGRRNQSVEPKKEEPEPETEAVSSSQEIPTMPQPIEKV 
SVSTQTKKLSASSPRMLHRSTQTTNDGVCQSMCHDKYTKIFNDF 
KDRMKSDHKRBTERWREALEKLRSEMEEEKRQAVNKAVANMQG 
EMDRKCKQVKE KCKEE F VE E I KXLATQHKQLI SQTKKKQWCYNC 
EEEAMYHCCWNTS YCS IKCQQEJHWHAEHKRTCRRKR 



VRFPRRGGAPPTVLTPGRQCKSVFt^PQRPGSEPDIPARGQPHPP 
RPVGVSTSAQAQVQPPAMHRRRIALGWFCLLACTSLSVLWVYI, 
ENWLPVS YVP Y YLPCPE I FNMKLHYKREKPLQPWWSQYPQPKL 

LEHRPTQLLTLTPWLAPIVSEGTFNPELLQHIYQPLNLTIGVTV 
FAVGN/HFLES AEE FFKRG YRVH Y YI FTDNPAAVPGVPLGPHRL 
LSSIPIQGHSHWEETSMRRMETISQHIAKRAHREVDYLFCLDVD 
MVFRNPWGPETIiGDLYAAIHPSYYAVPRQQFPYERRRVSTAFVA 
D SEGDFYYGGAVF GGQVARVYEFTRG CHMA ILAD KANG IMAAWR 
EESHIiNRKFI SNKPS KVLS PE YLWDDRKPQPPSL KLI RFSTLDK 
DISCLRS 



DVGt-NIKRKRCDLDVFLESPRXPSGRRDRAPEKQRRIAANKCLC 
TGVREGEPPS/TTSQKVKEAGRDFTYLIVVLFGISITGGLFYTI 
FXELFSSSS PSKI YGRALEKCRSHPEVIGVFGESVKGYGEVTRR 
GRRQHVRFTEYVKDGLKHTCVKFYIEGSEPGKQGTVYAQVKENP 
GSGEYDFRYIFVEIESYPRRTIIIEDNRSQDD 



SQDNIGHRLLQKHGWKIiGQG LGKSLQGRTDPI P I WKYDVMGMG" 

RMEMEliDYAEDATBRRRVLEVEKEDTEELRQKYKDYVDKEKAIA 

KALEDLRANFYCELCDKQYQKHQEFDNHINSYDHAHKORLKDLK 

QREFARNVSSRSRKDEKKQEKALRRIiHELAEQRKQAECAPGSGP 

MFKPTTVAVDEEGGEDDKDBSATNSGTGATASCGLGSEFSTDKG 

GPFTAVQ1TNTTGLAQAPGLASQGISFGIKNNLGTPLQKLGVSF 

S FAKKAP VKLES I AS VFKDHAE EGTSEDGTKPDEKSS DQGLQKV 

GDSDGSSNLDGKKEDEDPQDGGSLASTLS KLKRMKREEGAGATE 

PEYYHYIPPAHCKVKPNFPFLLFMRASEQMDGDNTTHPKNAPES 

KKGSSPKPKSCIKAAASQGAEKTVSEVSEQPKETSMTEPSEPGS 

KAE AKKALGGD VSDQS LE S HS Q KVS ETQM CE SNSS KETSLAT PA 

GKESQEGPKHPTGPFFPVLSKDESTALQWPSELLIFTKAEPSIS 
YSCNPLYFDFKLSRNKDARTKGTEKPKDIGSSSKDHLQGLDPGE 
PKKSKEVGGEKIVRSSGGRMDAPASGSACSGIiNKQEPGGSHGSE 
TEDTGRSLPSKKERSGKSHRHKKKKKHKKSSKHKRKHKADTEEK 
SSKAESGEKSKKRKKRKRKKNKSSAPADSERGPKPEPPGSGSPA 
PPRRR RRAQDDSQRRSLPAEEGS SGKKDEGGGGSS SQDHGGRKH 
KGELPPSSCQRRAGTKRSSRSSHR5QPSSGDEDSDDASSHRLHQ 
KSPSQYSEEEEEEDSGSEHSRSRSRSGRRHSSHRSSRRSYSS5S 
DASSDQSCYSRQR8YSDDSYSDYSDRSRRHSKRSHDSDDSDYAS 
SKHRSKRHKYSSSDDDYSLSCSQSRSRSRSHTRERSRSRGRSRS 
SSCSRSRSKRRSRSTTAHSWQRSRSYSRDRSRSTRSPSQRSGSR 
KRSWGHESPEERHSGRRDFIRSKIYRSQSPHYFRSGRGEGPGKK 
DDGRGDDSKATGPPSQNSNIGTGRGSEGDCSPBDKNSVTAKLLL 
EKIQSRKVERKPSVSEEVQATPNKAGPKLKDPPQGYFGPKLPPS 
LGNKPVLPLIGKLPATRKPNKKCEESGL3RGEBQEQSETEEGPP 
GSSDALFGHQFP\SEBTTGPLLDPPPEESKSGEVTAI>HPVAPLG 
PPAHFDCYLGDPTISHNYLPDPSDGNTLESLDSSSQPGPVESSL 
LP IAPDLEHFPSYAPPSGDPS I ESTDGAEDA\SIiAPLESQPI TF 


5971 






TPKEMEK YS KLQQAAQQHI QQQLLAKQVKAFPASAALAPATPAlT"" 

QPIHIQQPATASATSITTVQHAILQHHAAAAAAAIGIHPHPHPQ 

PLAQVHHIPQPHLTPISLSHLTHSIIPGHPATPIASHPIHIIPA 

SAIHPGPFTFHPVPHAALYPTLLAPRPAAAAATALHLHPLLHPr 

FSGQDLQHPPSHGT 


5972 


53 


2149


s YFVG VDMDNP 1 GNWDGRFDG VQL CS FACVEST I LLH IND 1 1 
PES VTQE RRp PKLAFMSRG VGDKGSSS HNKPKATGS TSDPGNRN 
RS2LFYTLNGSSVDSQPQSKSKNTWYIDEVAEDPAKSLTEISTD 
FDR3 S P PLQ P PP VNS LTTENRFHSLPFSLTKM PNTNGS IGHS PL 

SI^AQSVMEELNTAPVQESPPLAMPPGNSHGLEVGSLAEVKENP 
P F YG VIRW I GQP PGLNEVLAGLELEDB CAG \ CTDGTF / R EGTR Y 

FTCAIjKKALFVKLKSCRPDSRFASLQPVSNQIERCNSLAIWEAY 
LSEWEENTPTQKWEKEGLEIM IG\KKKGIQGHYNS CYLDSTLF 
CLFAFSSVLDTVLIiRPKEKNDVEYYSETQELLRTEIVNPLRIYG 
YVCATKIMKLRKILEKVEAASGFTSEEKDPEEFLNILFHHILRV 
EPLLKIRSAGQKVQDCYFYQIFMEKNEKVGVPTIQQLLEWSFIN 
SNLKFAEAPSCLIIQMPRFGKDFKLFKKIFPSLELNITDLLEDT 
PRQCRICGGIiAMYECRECYDDPDISAGKIKQFCKTCNTQVHLHP 
KRLWHKYNP VSIjPKDLPDWDWRHG CIPCQNMEL FAVLCI ETSH Y 
VAFVKYGKDDSAWLFFDSMADRDGGQigGFNIPQVTPCPEVGEYL 
KMSLEDLHSLDSRRIQGCARRLLCDAIYVPCTQSPTM3LYK 


5973 


440 


17*1 


ILIAGSPSPRDQCSQRQSSGGDKELVTRGCTFSTAWSPSAMTQ 
EPFREELAYDRMPTLERGRQDPASYAPDAKPSDLQLSKRLPPCF 
SHKTWVFSVLMGSCLLVTSGFSLYLGNVFPAEMDYLRCAAGSCI 
PS AI VS FTVSRRNANVI PNFQI LF VS TFAVTTTCLI WFGC KLVL 
NPSAININFNLILLLLLELLMAATVI IAARSSEEDCKKKKGSMS 
D S ANI LDEVP F P AR VL KS YS WE VI AG I SAVLGG 1 1ALNVDDSV 
SGPHLSVTFFWILVACFPSAIASHVAAECPNKCLVEVLIAISSL 
TS PL LFTASG YLS FS I MR I VEM FKDYPPAI KPS YDVLLLLLLLV 
LLLQA/ GPQHGHRHPVRALQGQCKAAGCILGHPERPAGAPGWGG 

GQEPPEGVRQGESLESRRGANGPVTPRRGNRVAAPSLAPGMETH 
NP 


5974 


65 


2007


NGDGKDLFGH I WAWRSNOl 1 S'.Kl VRRS PHAGMAEDE^DAKSPKTG 
GRAP PGGAE AGEPTTLLQRLRGT I S KAVQNKVEGI LQDVQKFSD 
NDKLYLYLQLPSGPTTGDKSSEPSTLSNBEYMYAYRWIRNHLEE 
HTDTCL PKQSVYDAYRKYCESLACCRPLSTANFGKI IRE I FPDI 
KARRLGGRGQSKYCYSGIRRKTLVSMPPLPGLDLKGSESPEMGP 
E VTPA PRDELVEAACALTCDWAERI LKRS FSS I VEVARFLLQQH 
LISARSAHAHVLKAMGLAEEDEHAPRERSSKPKNGLENPEGGAH 
XKPERLAQPPKDLEARTGAGPLARGERKKS WES SAPGANNLQV 
NALVARLPLLLPRAPRSLI PPI PVSPPILAPRLSSGALKVATLP 
LSSRAGAPPAAVPIINMILPTVPALPGPGPGPGRAPPGGLTQPR 
GTENREVGtGGDQGPHDKGVKRTAEVPVSEASGQAPPAKAAKQD 
IEDTASDAKRKRGRPLKKSGGSGERNSTPLKSAAAMESAQSSRIi 
PWETWGSGGEGNSAGGAERPGPMGEAEKGAVLAQG\QGDGTVSK 
GGRGPGSQHTKEAEDKIPLVPSKVSVIKGSRSQKEAFPLAKGEV 
DTAPQGNKDLKEHVLQS SLS QEH KDPKATP P 




4293 


2200 



LGLgymiTSGK IHQAM VTS* LhfEDftJES VT VE WI BNGDTKGK \B ID " 
LESIFSLNP\DL\VPDGEIEPSP\ETPPPPASSAKVNKIVKNRR 
TV\ASIKNDPPS\RDNRWGSARARPSQFPEQFSSAQQNGSV\S 
D IS P VQAAKKE PG PPSRRKSNCVKEVEKIiQE KREKRRLQQQELR 
EKRAQDVDATNPNYE I MCMIRDFRGSLDYRPLTTAJDP I DEHR I C 
w ^-vrtr^rijiN^xuiiyi^iVX/ijiJV J, 11 JroKUVVrTvHEPKQiCVDLTRYL 

en qt frfd yafdds apnemvyr ftarplvet i fergmatcfayg 
3tgsgk™tmggdfsgjwqdcskgiyalaardvfIi^kkpwykk 

[» ELQ VYATFF E I YSGKVF1)IJjNRKTKLRVLEDGKQQVQVVGLQE 
^EVKCrajVLKlIDIGNSCRTSGQTSANAHSSRSHAVFQIILRR 

<gklhgkfslidlagnergadtssadrqtrlegaeinksllalk 

5CIRAL3RKKPHTPFRASKLTQVLRDSFIGENSRTCMIATISPG 
IAS CENTLNTLRYANRVKELTVDPTAAGDVRP IMHHPPNQ I \DD 

jBTQWGVGSS pqrddlkllceqneeevspqlftfheavsqmvem 








EEQWEDHRAVFQESIRWLEDEKALLEMTEEVDYDVDSYATQLE
AILEQKIDILTELRDKVKSFRAALQEEEQASKQINPKRPRAL


5975 


4293 


2200 


LGLQMHTTSGRIHQAMVTSLNEDNESVTVEWIENGDTKGK\EID 
I*ESIFSLNP\DL\VPDGEIEPSP\ETPPPPASSAKVNKIVKNRR 
TV\AS I KNDPPS \RDNRWGSARARPSQFPEQPSS AQQNGSV\S 
DISPVQAAKKEFGPPSRRKSNCVKEVEKLQEKREKRRLQQQELR 
EKRAQDVDATNPNYE I MCMI RD FRGSLD YRPLTTADP 1 0EHRI C 
VCVRKRPLNKKETQMKDLDVITX PSKDWMVHEPKQKVDLTRYL 
ENQTFRFBYAFDDSAPNEMVYRFTARPLVETIPERGMATCFAYG 
QTGSGKTHTMGGDFSGKNQDCSKGIYALAARDVFIiMLKKPNYKK 
LELQVYATFFErYSGKVFDI.T»NRKTKLRVLEDGKQQVQWGLQE 
RE VKCVEDVLKLI D I GNSCRTSGQTSANAHS SRSHAVFQ I ILRR 
KGKLHGKFSLIDLAGNSRGADTSSADRQTRLBGAE1NKSLLALK 
ECIRAIjGRNKPHTPFRASKLTQVIiRDSFIGENSRTCMIATISPG 
MAS CENTLNTLRYANRVKELTVDPTAAGD VRP I MHH PPNQI \ DD 
LETQWGVGSSPQRDDLKLLCEQNEEEVSPQLFTFHEAVSQMVEM 

EEQWEDHRAVFQESIRWLEDEKALLEMTEEVDYDVDSYATQLE
AILEQKIDILTELRDKVKSFRAALQEEEQASKQINPKRPRAL


5976 


20 


2949 


VHHLHLTRVSVWNLDIILRIAQQMGlKTJLNLVIiG\LKRA\LEF 
PEVS WMEVKD PNMKGAMLTNTGKYAI PTIDA\ EAYAXGKKEKPP 
PLPEEPSS SSEEDDPI PDELLCLICKOIMTDAWIPCCGNS YCD 
ECIRTALLESDEHTCPTCHQNDVSPDALIA^KFLRQAVNNFKNE 
TG YTKRXiRKQLPS P PP P I PPPRPLI QRNLQPLMRS P I SRQQDPL 
MI P VTSS S THPAPS 1 S S LTSNQS SLAPP VSGNPS SAPAP VPDI T 
ATVS IS VHSEKSDG PFRDSDNKII#PAAALASEHS KGTS S I AI TA 
LMEEKGYQVPVX^TPSLLGQSLJjHGQLIPTTGPVRINTARPGGG 
RPGWEHSNKLG YLVS P PQQIRRGERS CYRS INRGRHHSERSQRT 
QGPSLPATPVFVPVPPPPLYPPPPHTIiPLPPGVPPPQFSPQFPP 
GQP \ P PAGYSV? P PG FP PAPANLSTP WVSSG VQTAHSNTI PTTQ 
APPLSREEFYREQRRLKEEEKKKSKLDEFTNDFAKELMEYKKIQ 
KERRRSFSRSKSPVSGSSYSRSSYTYSKSRSGSTRSRSYSRSFS 
RSHSRSYSRSPPYPRRGRGKSRNYR5RSRSHGYHRSRSRSPPYR 
RYHSRSRSPQAFRGQSPNKRKVPQGETEREYFNRYREVPPPYDM 
KAY YGRS VDFRD P FEKER YREWBRKYREWVEKYYKG YAAGAQPR 
PSANRENFSPERFT.PLNIRNSPFTRGRREDYVGGQSHRSRNIGS 
N YPEKLSARDGHKfQKDNTKS KEKES ENAPGDG KGNKHK KHR KRR 
KGEESEGFLNPELLETSRKSREPTGVEENKTDSLFVLPSRDDAT 
PVRDEPMDAESITFKSVSEKDKRERDKPKAKGDKTKRKNDGSAV 
SKKENIVKPAKGPQEKVIXj\DVRDL1J3LNL\QIjKKPKEETPKDL 
TILNHHLPLRRMKKSL\EPP\EKLTLNQQK\TPRNKTSQRGKSE 
EGLFQRCQIRKANN 


5977 


1363 


1336 


FLEDRGQVLSHFQCLSLHSINHILHPGAGVAAG^PAtGW/REYiLT 
PVLKESKFKETGVITPEEFVAAGDHLVHHCPTWQWATGEELKVK 
AYLPTGKQFLVTKNVPCYKRCKQME YSDELBAI IEEDDGDGGWV 
DTYHNTG I TG I TEAVKE I TLENKDNIRLQDCSALCE EEEDEDEG 
EAADM EE YEESGLLETDEATLDTRKI VEACKAKTDAGGEIlA I LQ 
TRTYDLYlTYDKyYQTPRLWLFGYDEQRQPLTVEHMYEDISQDH 
VKKTVTIENHPHLPPPPMCSVHPCRHAEVMKKIIETVAEGGGEL 
GVHMYLLIFLKFVQAVIPTI EYDYTRHFTM 


5978 


160 


3213 


RDGARRWGGCQS PLTWAP6 F^YRft FDLATSGRRLRGQTAEPAGRQ 
RPRRE P EAMDEQSVES IAEVFRCFICMEKLRDARLCPHCSKLCC 
FSCIRRWLTEQRAQCPHCRAPLQLRELVNCRWAEEVTQQLDTLQ 
LCSLTKHEENEKDKCENHHEKLSVFCWTCKKCICHQCAIiWGGMH 
GGHTFKPLAEIYBQHVTKVNEEVAKLRRRLMELISLVQEVERNV 
EAVRNAKDER VRE I Rl^VEMMI ARLDTQLKNKLITLMGQKTSLT 
QETELLESLLQEVEHQLRSCSKSELISKSSEILMMFQQVHRKPM 
ASFVTTPVPPDFTSELVPSYDSATFVLBNFSTLRQRADPVYSPP 
LQVSGLCWRLKVYPDGNGWRGYYLSVFLELSAGLPETSKYEYR 
VEMVHQSCNDPTKNI IREFASDFEVGECWGYNRFFRLDLLANBG 
YLN^QNDTVILRFQVRSPTFFQKSRDQHWYITQLEAAQTSYIQQ 
INNLKERliT I ELSRTQKSRDLS P PDNHLS PQNDDALETRAKKS A 








CS DMLLER \G P YSAS \VREAKEDEEDEEKIQNEDYHKELSDGDL

DLDLVYEDE VNQLDGS S S SAS STATSNTEENDI DEETMSGENDV 

EYNNMELEEGELMEDAAAAGPAGSSHGYVGSSSRISRRTHLCSA 

ATSSLLD IDPLILIHLLDLKDRSS I ENLWGLQPRPPASLLQPTA 

SYSRKDKDQRKQQAIWRVPSDLKMLKRIiKTQMAEVRCMKTDVKN 

TLSEIKSSSAASGDMQ7SLFSADQAALAACGTENSGRLQDLGME 

LLAKSSVANCYIRNSTNKKSNSPKPARSSVAGSLSLRRAVDPGE 

NSRSKGDCQTLSEGSPGSSQSGSRHSSPRALIHGSIGDILPKTE 

DRQCKALDSDAVWAVFSGLPAVEKRRKMVTLGANAKGGHLEGL 

QMTDLENNSETGELQPVLPEGASAAPEEGMSSDSDIECDTENEE 

CEEHTS VGG FHDS FM VMTQ PPDEDTHSS FPDGEQIGPEDLS FNT 

DENSGR 


5979 


212 


3665 


LPDM'l'W YliW LKliLAFGFAFliDTE V ir' V TCJOS PTPS PTOAYLNASE 
TTTLS P SGS AVI STTTI ATTPSKPTCDEKYAN I TVDYLYNKETK 
LFTAKLNVNENVE OGNNTCTNNEVHNLTECKNASVS I S HNS CTA 
PDKTLILDVPPGVEKVPVHCCS\QVEQPDSTIWLKWKNIETSTC 
DTQN I T YRFQCGNM I FDNKEI KIiENLE PEHE YKCDS E ILYNS HK 
FTNASKI IKTDFGSPGEPQIIFCRSEAAHQGVITWNPPQRSFHN 
FTLCY I KETE KDCLNLD KNL I KYD LQNL KP YTK YVLS LHAY 1 1 A 
KVQRNGSAAMCHFTTKSAP PSQVWNMTVSMTS DNSMHVKCRP PR 
DRNGPHERYHLEVBAGNTLVRNESHKNCDFRVKDLQYSTDYTFK 
AYFHNGDYPGEPFILHHSTSYNSKALIAFLAFIiI IVTSIALLW 
LYKI YDLHKKRS CNLDEQQELVE RDDE KQLMNVE P IHAD I LLET 
YKRKIADEGRLFIiAEFQS I PRVFS K FP I KEARKP FNQNKNRYVD 
ILPYDYNRVELSEINGDAGSNYINASYIDGFKEPRKYIAAQGPR 
DBT VDDFWRM I WEQKATVI VMVTRCEEGNRNKCAE YWPSMEEGT 

rafgeccckdltkhkrcpNdyiiqiolnivnkkekatgrevthiq 

FTSWPDHGVPEDPHLLLKLRRRVNAFSNFFSGPIWHCSAGVGR 
TG TYI G I DAM LEGLEAEN KVD VYG YWKLRRQRCLMVQVEAQ Y I 
LIHQALVE YNQFGETEVNLS ELHP YLHNMKKRDP PS E PSPLEAE 
FQRLPS YRSWRTQHIGNQE \ENKSKNRNSNVI PYDYNRVPLKHE 
LEMSKESEHDSDESSDDDSDSEEPSKYINASFIMSYWKP\EVM1 
AAQGPLKETIGDFWQMI FQRKVKVIVMLTELKHGDQE I CAQ YWG 
EGKQT YGDIE VDLKDTDKSS TYTLRVFEIjRHSKRKDS RTVYQ^q 
YTNWSVEQLPAEPKELISMIQVVKQKIiPQKNSSEGNKHHKSTPL 
LIHCRDGSQQTGIFCALLNLLESAETEEVVDIFQVVKALRKARP 
GMVSTFEQYTQFLYDVIASTYPAQNGQVKKNNHQEDKIBFDNEVD 

KVKQDANCVNPLGAPEKLP EAKEQAEGSEPTSGTEGPEHSVNGP 
ASPALNQGS 


5980 


3 


2363 


DAWGCKLRRLR FT YGTQTR VS LALPGQ YEIj VHTIiVAHQGIWBTI 
PEEDLBVQENNEDAAHDLTELEVTMHHALLQEVDVVVAPCQGLR 
PTVDVLGDLVNDFIiPVITYAIiHKDELSERDEQELQElRKYFSFP 
VFFFKVP KLGSE 1 1 DSSTRRMESERfi PL YRQLIDLG YLSSSHWN 
CX?AF<jQDTKAQSMIjVEQSEKLRHLSTFSHQVLOTRLVDAAKALN 
LVHCHCLDI FIKQAFDMQRDLQ ITPKRLBYTRKKENELYESLMN 
I ANRKQEEM KDM I VETLNTMKEELLDDATNMEFKDVI VPENGE P 
VGTRE I KCCIRQ IQELI ISRLNQAVANKLI S S VD YLRES FVGTL 
ERCLQSLEKSQDVS VHI TSNYLKQ ILNAAYHVE VTFHSGS SVTR 
MZiHEQI KQ 1 1 QRIT W VS PPAI TLE WKRKVAQBAIESLSAS KLAJC 
SICSQFRTRLNSSHEAFAASLRQLEAGHSGRLEKTBDIiWLRVRK 
DHAPRLARLSLESRSLQDVIjLHRKPKLGQELGRGQYGVVYLCDN 
WGGHFPCALKSWPPDEKHWNDLALEFHYMRSLPKHERiiVDLKG 
S V1DYNYGGGSS IAVLLIMERLHRDLYTGLKAGLTLETRLQIAL 
DVVEGIRFLHSO^LVHRDIKLKNVLLDKQNRAKITDLGFCKPEA 
MMSGS IVGTPX HMAPELFTGKYDNS VD VYA FGI LFW YI QSG5VK 
LPEAFERCAS KDHLWNNVRRGAR PERLP VFDEECWQLMEACWDG 
D PLKRPLLG I VQPMLQG IMNRLCKS\NSEQPNRGLDDST 


5981


1 


2519 



3RKHSAAMEKPWGAADGLSRWPHGIX5LLLLLQLLP^STtsSDW~ 
DAPPPPAAPLPRWSGPIGVSWGLRAAAA\GGAFPRGGRWRRSAP 
3\EDEECGRVRDFVAKLANNTHQHVFDDLRGSVSLSWVGDSTGV 
ILVLTTFHVPLVIMTFGQSKLYRSEDYGKNFKDITDIjINNTFIR 


5982






l rEPGMAIGPl^"SGKVVLTAEVS(K3SRGGRIPRSSDFAJOTVQTD 
LPFHPXiTQMMYS PQNSDYLLALSTENGIiWVS KNFGGKWEE IHKA 
VCLAKWGS DNTI FFTTYANGSCKRDLGAIjELWRTSDIiGKSFKTI 
GVECIYSFGLGGRFLFASVMADKDTTRRIHVSTDQGDTWSMAQLP 
S VGQEQ FYS ILAANDDMVFMHVDEPGDTGFGTI FTSDDRG I VYS 
KS LDRHL YTTTGGE TDFTNVTSLRGVYI TS VLSEDNS IQTM ITF 
DQGGRWTHLRKPENSECDATAKNKNECSLHIHASYS ISQKLNVP 
MAPLS E PNAVG I VI AHGS VGDAI S VMVPDVYI SDDGG YS WTKML 
EGPHYYTILDSGGIIVAIEHSSRPINVI1CFSTDEGQCWQTYTFT 
RDP I YFTGLASE PGARSMNIS I WGFTES FLTSQWVS YT I DFKD 1 
LERNCEEKDYTIWIiAHSTDPEDYEDGCILGYKBQFIiRLRKSSVC 
QNGRDYWTKQPSICLCSLEDFLCDFGYYRPSNDSKCVEQPELK 
GHDLEFCLYGREEHLTTNGYRKIPGDKCQGGVNPVREVKDLKKK 
CTSNFLSPEKQNSKSNSVPI ILAI VGLMLVTWAGVLIVKKYVC 

GGRFLVHLYSVLQQH\AEA\NGVDGVDAI»DTASHTNK5GYHDDS 
DEDLLE 


5983 


56 


231* 


ATRPPRG5 S WCRQFSRTASAAPGRSNMLR I PVRKALVGLS KS P K
GC VRTTATAASNLI EVFVDGQS VMVEPGTTVLQACEKVGMQI PR 
FCYHERLS VAGNCRMCLVEI EKAP K.WAACAMPVM KG WNI LTNS 
EKSKKAREGVMEFLI»ANHPLDCPICDQGGEC3)LQDQSMMFGNI)R 
S R FLEGKRAVEDKNIGPLVKT IMTRCIQCTRC I RFASBI AG VDD 
LGTTGRGNDMQVGTYIEKMFMSELSGNIIDICPVGALTSKPYAF 
TARPWETRKTESIDVMDAVGSNIWSTRTGEVMRIIjPRMHEDIN 
EEM ISDKTRFAYDGLKRQRLTE PMVRNE KGLLT YTS WEDALSR V 
AGMLQS FQGKD VAAI AGGLVDAEALVALKDLIjNRVDSDTLCTE E 
VFPTAGAGTDliRSNYLIiNTTIAGVEEADWLLVGTNPRFEAPLF 
NARIRKSWLHNDIiKVALIGSPVDLTYTYDHLGDSPKILQDIASG 
SHPFSQVLKEAXKPMWLGSSALQRNDGAAILAAVSS IAQKIRM 
TSGVTGDWKVMKILHRIASQVAAIJ5LGYKPGVEAIRKNPPKVLF 
LLGADGGCITRQDLPKDCFIIYQGHHGDVGAPIADVILPGAAYT 
EKSATYVNTEGRAQQTKVAVTPPGLAREDWKI I RAIiSEIAGMTJj 
PYDTL\DQVRNRLEEVSPNLVRYDDIEG\ANYFQQANBI,SKLVN 
QQLLADPLVPPQLIWKDFYMTDS I SRASQTMAKCVKAVTEGAQA 
VEEPSIC 


5984 


24 B 

755 1 


1763 


CARGDGGRRRHRASGRRAGRGKP\AGLK5QGQRAVPKRAVARGG 
RQ\YSAAIALLEPAGSEIADDLSILYSNRAACYLKEGNCSGCIQ 
E)CNRAL2LHPFSMKPLLRRAMAYETLEQYGKAYVDYKTVLQIDC 
GIiQ LANDS VNRLSRILMELDG PNWREKLSL I PAVPASVPLQAWH 

pakemiskqagdssshrqo<3itdektf:<alkbegnqcvndknyk 

DALSKYSECLKINNKECAIYTNRABCYLKIiCQFEEAKQDCDQAI, 

QIADGNVKAFYRRALAHKGLKNYQKSL IDLNKVI LLDPS 1 1 E AK 

MELEEVTRLLNLKDKTAPFNKBKERRKIBIQEVNEGKEEPGRPA 

GEVSTGCLASEKGGKSSRSPEDPElOiPIAKPKNAYEFGQIINAL 

STRKDKEACAHI^AITAPKDLPMFLSNKLEGDTFLLLIQSLKNN 

LIEKDPSLVYQHLLYLSKAERFKMMLTLISKGQKELIEQLFBDL 
SDTPNNHFTLED I QALKRQYEL


5985 




1193 


SSVCMAGTYVSNLK3KKQRSVSFLASGLMRVSTGPELRLHHSF^ 
TGDVGRRICRLLVGbFTKGDTSSKRVHPFSPGPCFLIiCDLARVG 
SSPKINVSPFYQN\QTSTQRSCTVFWQRCSLVGPFQVTVFTMY 
FHHSLRSISRFSSG 




22 


1408 



RRVAKPGTAEPAKARk^RkGRARRDLAGAERKAGVSERGDSGR
RRPNPS IPSAAAGMSHIQI PPGLTELLQGYrVEVLRQQPPDIiVE 
FAVEYFTRLREARAPASVLPAATPRQSLGHPPPEPG PDRVADAK 
GDSESEEDEDLEVPVPSRFNRRVSVCAETYNPDEEEEDTDPRVI 
FIPKTDEQRCRLQEACKDILLFKMLDQEQLSQVLDAMFERIVKAD 
SHVIDQGDDGDNFYVI ERGTYDI LVTKDNQTRSVGQYDNRGSFG 
SLALMYNTPRAATI VATS EGSL WGLDRVTFRR 1 1 VKNNAKKRKM 
PES FI ES VP LLKS LE VSERMKI VD VIGBKI YKR/DGER X I TQGE 
<\ADSFYII2SGEVSILIRSRTKSNKDGGNQEVEIARCKKGQYF 
5ELALVTNKPRAASAYAVGDVKCLVMDVQAFERLLGP CMD IKKR 
aSHYEEQLVKM FGSS VDLGNLGQ 


5986 
5987




484


DAHKSTS LTFHWKL WGRHRGRRRGLAH PKNHLS PQQGGATPQ V P 
SPCCRFDSPRGPPPPRLGLLGALMAEDGVRGSPPVPSGPPMEED 
GLRWTPKSPLDPDSGLLS CTLPNGFGGQSG PEGERSLAP PDAS I 
LISNVCSIGDHVAQEI,FQGSDLGMAEEAERPGEK\AGQHSPLRE 
EHVTCVOSILDEFLQT\YGSLIPLSrDEWEKLEDIPQQEPSTP 
S RKGLVLQL I QS YQRMPGNAMVRGFRVA YKRHVLTMDDIjGTL YG 
Q^LNDQVMNM YGDL VMDTVP EK \ VHFFNS F FY \DKLRTKG YDG 
VKR WTKNVD I FNKELLIiI PIHLEVH5/SL I S VD VRRRT1 TYFDS Q 
RTLNRRCP KH I AKYLQAE AVKKDRLDFHQG WKG YFKMNVARQNW 

DSDCGAFVLQYCKHIiALSQPFSFTQQDMPKLRRQIYKELCHCKIi 
TV 




1806 


484 


DAWK^ TSLTFliWKLWGRHRGRRRGLRHPKNHIiS PQQGGATPQ VP 
SPCCRFDS PRGPPPPRLGLLfiAT.MaPTV^wcr'c DnreeAnmiBm 

GLRWTPKSPLDPDSGLLSCTLPNGFGGQSGPEGERSLAPPDASI 
LISNVCSIGDHVAQEI.FQGSDLGMAEEAERPGEK\AGQHSPIjRE 
EHVTCVQSILDEFLQT\YGSLIPLSTDEWEKLEDIFQQEFSTP 
SRKGLVLQLIQSYQRMPGNAMVRGFRVAYKRHVLTMDDLGTLYG 
CNVOiNDQVMNMYGDLVMDTVPEK\VHFFNSFFY\DKLRTKGYIX3 
VKRWTKNVDI FNKELLLIPIHLBVHWSLISVDVRRRTITYFDSQ 
RTLNRRCPKHIAKYIjOARJWTnfnP T.n?^or*uvr«vcvuMtnt 

DS DCGAFVI*Q YCKHLALS QP FS FTQQDMPKLRRQI YKELCHCKL 
TV 


5988 


1292


410 


FKKYFLS FLGLLESSHSRDRI HNIiVLMFlULATHNLVWWFTCRFQ 
RLDCI YLNAG IMPNPQLNI XALLFGLFS \AEGLLTQGDKI TADG 
LQB VFETDVFGHF I LIRELE P LLCHSDNPSQLI WTS SRNARKSN 
FSLEDFQHSKGKEPYSSSKYATI)LLSVALNRNPNQQGLySNVAC 
PGTALTNLTYGILPPFIWTLLMPAILLLRFFANAFTLTPYNGTE 
ALVWLFHQKPES LNPLI KYLS ATTGFGRNYIMTQKMDLDEDTAE 
KFYGKLLELEKH I RVTIQKTDNQARLSGS CL 


5989 


194 


2610 


AMDPPQHSQHVLEQLNQXJRQLGLLCDCTFVVDGVHFKAHKAVLA 
ACSE Y FKMIjFVDQKDVVHLD I SNAAGLGQVLEFM YTAKLSLS PE 
NVDDVL\AVATFLQMQDI I TACHAtiKSLAEPATS PGGNAEALAT 
EGGEKRAKEEKVATSTLS RLEQAGRSTP I GPSRDLKEERGGQAQ 
SAASGAEQTEKADAPREPPPVELKPDPTSGM7\AAEAEAALSESS 
EQEMBVEPARKGEEEQKEQEEQEEEGAGPAEVKEEGSQLENGEA 
PEENENEESAGTDSGQELGSEARGIjRSGTYGDRTESKAYGSVIH 
KCEDCGKEFTHTGNFKRHIRIHTGEKPFSCRECSKAFSDPAACK 
AHEKTHS PLKPYGCEECX3KS YRLI SLLNLRKKRHSGEARYRCED 
CGKIjFTTSGNLKRHOLVHSGE KP Yrjrnvrra w c i?c noTcvunuT -o 

thdtdkehkcphcdkkfnqvgnlkahlkihiadgplkcrecgkq 
fttsgnlkrhlrihsgekpyvcihcqrqfadpgalqrhvrihtg 
ekpcqcvmcgkaftqassliahvrqhtgekpyvcercgkrfvqs 

SQLANHIRHHDNIRPHKCSVCSKAFVNVGDLSKHII IHTGEKPY 
LCDKCGRGFNRVDNLRSIIVKTVHQGKAGIKILEPEEGSEVSWT 
\TDDMVTIiATEALAATAVTQLTVVP VGAAVTADE TE VL KAE I S KA 
VKQVQEEDPNTHILYACDSCGDKFLDANSLAQHVRIHTAQALVM 

fqtdadfyqqygpggtwpagqvlqagelvfrprdgaegqpalae 

TSPTAPECPPPAE 


5990 


2 


4700 


fgpgpdsgggargsgwgsrsqapygtlgavsggeqvllheeagd 
sgfvslsrlgpslrdkdlemeelmlqdetllgtmqsymdaslis 
liedfgslgevemsi,pdpswdfsppsfletsspklpswrpprsr 
prwgqspppqqrsdgeeeeevasfsgqilageldncvssipdfp 
mhlacpeeedkataaemavpaagdesisslselvramhpyclpn 
lthlasledelqeqpddltiipegcwleivgqaatagdoleipv 

VVRQVSPGPRPVIMDSLETSSALQLIjMPTLESETEAAVPKVTL 
CSEKEGLSLNSEEKLDSACLLKPREWEPWPKEPQNPPANAAP 
GSQRARI03RKKKSKEQPAACVEGYARRLRSSSRGQSTVGTEVTS 
QVDNLQKQPQEELQKESGPLQGKGKPRAWARAWAAALENSSPKN 
LERS AGQSS PAKEGPLDIiYPKLADTIQTNP I PTHLSLVDSAQAS 

pmpvdsveadptavgpvlagpvpvdpglvdlastsselveplpa 

EPVLIWPVIiADSAAVDPAVVPISDNLPPVjQAVPSGPAPVDLALV 








DPVPNDLTP VDPVLVKSRPTDPRRGAVSSALGGSAPQIiLVESES

LDPPXTIIPEVKBWDSLKIESGTSATTHBARPRPLSLSEYRKR 

RQQRQAETE KRS PQPP1X3KWPSLPETPTGIADI PCLVI PPAPAK 

KTALQRS PETPLEI CLVP VGPSPASPSPE PPVSKPVASS PTEQV 

PSQEMPLLARPSPPVQSVSPAVPTPPSMSAALPFPAGGLGMPPS 

LPPPPLQPPSLPLSMGPVLPDPFTHYAPLPSWPCYPHVSPSGYP 

CLPPPPTVPLVSGTPGAYAVPPTCSVPWAPPPAPVSPYSSTCTY 

GPIiGWGPGPQHAPFWSTVPPPPLPPASIGRAVPQPKMESRGXPA 

GPPENVLPLSMAPPLSLGLPGHGAPQTEPTKVEVKPVPASPHPK 

HKVS ALVQSPQM KALACVSA EG VTVEKPAS ERLKPETQETRPRB 

KPPLPATKAVPTPRQSTVPKLPAVHPARLRKLSFLPTPRTQGSE 

DWQAFISEIGIEASDLSSLLEQFEKSEAKKECPPPAPADSLAV 

GNSGGVD I PQEKRPLDRIiQAPELANVAGLTPPATP PHQLWKPLA 

AVaxjunisAlsS JrKo 1 AQEGTLKPEGVTSlAlwrAAVRljQEGVHGPS 

RVHVGS3DHDYC\ VRSRTPPKK\MPALLI PEVGSRWNVKRHQDI 

TIKPVIjSLGPAAPPPPCIAASREPLDHRTSSEQADPSAPCLAPS 

SIiLS PEASPCRNDMNTRTPPEPSAKQRSMRC YRKACRSAS PSSQ 

GWQGRHGRNSRSVSSGSNRTSEASSSSSSSSSSSRSRSRSLSPP 

HKRWR3SSCSSSGRSRRCSSSSSSSSSSSSSSSSSSSSRSRSRS 

PS PRRRSDRRRRYSS YRSHDHYQRQRVLQKERA I EERRWFIGK 

IPGRMTRSELKQRFSVFGBIEECTIHFRVQGDNYGFVTYRYAEE 

nfAAl n&UrtivJjKy/UJciy ir r ULK-r vUKKUi* CKRS I bDUJSNREDF 

DPAPVKSKPDSLDFDTLLKQAQKNLRR 


5991 


334 


1379 


RLSSHFSQCS PS I YC \TKFDKQGNVTS FERKKTELYQELGLQAR 
DLRFQHVMS I TVRNNRI IMRME YLKAVITP ECLLI LD YRNLNLK 
QWLFR2LPSQLSGEGQLVTYPLPFEFRAIEALLQYWINTLQGKI, 
S ILOPLILRTIiDAIjGDP KHS S VDPS Kr.CTTLI^ttrcSJCQT .QP'T.F'Pn T 

. KIFKESILEILDEEEIJjEEIjCVSKWSDPQVFEKSSAGIDHAEEM 
ELL LEN YYRLADDLSNAARELRVL I DDSQS X X FXNLDSHRNVMM 
RLNLQLTMGTFSLSLFGLMGVAFGMNLESSLEEDHRIFWLITGI 
MFMGSGLIWRRLLSFLGR/LARSSIASYGMKDMVHGGIVEGL 


5992 


2 


609 


AGPDFRLVCGVSGSGFPGGRQGQATEWRPLRPWNGAMEKLRRVL 
SGQDDEEQGLTAQDSQINL/SEVLDASSLSFNTRIiKWFAlCFVC 
GVFFS I LGTGLLWLPGG I KLFAVF YTLGNLAALASTCFLMGPVK 
Q LKKMFEATRLLAT1 VMLL CFIFTLCAA1»WWHKKGLAVLFC I LQ 

PL Q MTW Y^T. CVT PVTVPfi AVT tfOPR QT.T.fi 


5993 


1650 


594 


AEGLGSWAWAGLGWAGRHMEAGGATGALGVGCKLPSAFCFPGS 
SVAMDMFQKVEKIGEGTYGVVYKAKNRHTGQLVALKKIRLDLEM 
EGVPSTAIRB ISLLKELKHPNIVRLLDWHNBRKLYIiVFEFLSQ 
DLKKYMDSTPG SELPLHL IKS YLFQLLQGVS FCHSHR VIHRDLK 
PQNLLINEIX3AIKLADFGLARAFGVPLRTYTHEVVTLWYRAPBI 
UATRFYTTAVDIWSIGCIPAEP4VTRKAXiFPGDS\EIDQ\LFRI 
FRMLGTPSEDTWPGVTQLPDYKGSFPKWTRKGLEEIVPNLEPEG 
RDLLMQLLQYDPSQR ITAKTALAHP YFS S PEP S PAARQYVLQRF 
RH 


5994 


394 


1934 


AGE VQLH VWIRGMRIQPQ/ XAAAX IDLDPDFEPQSRPRSCTWPX 
PRPE IANQPS KPPE VEPDLGE KVHTEGRSEP ILLPSRLPE PAGG 

POPnTT/3AVT£tPPTCftflQPPNAWf^rtCY2\.PT.TQn2lTPQaPPTrPT f 

LAQ I YE WMVRT VPY FKDKGDSNSS AGWKNS XRHNLSLHS KF I KV 
HNBATG KS SWWMLNPEGGKSGKAPRRRAASMDS SSKLLRGRS KA 
PKKKPSGLPAP PEGATPTS P VGHFA KWSGS PCSRNR EEADMWTT 
FRPRSSSNASSVSTRLSPLRPESEVLAE3IPASVSSYAGGVPPT 
LNEGUELLDGLNLTSSHSLLSRSGLSGFSLQHPGVTGPLHTYSS 
SLFS PAEGPLS AGEGCFSS SQALEALLTSDTP P P PADVLMTQVD 
PILSQAPTLLLLGGLPSSSKLATGVGLCPKPLEAPGPSSLVPTL 
SM Z APPPVMASAPX PKALGTPVLTPPTEAASQDRMPQDLDLDMY 
MENLECDMDNI ISDLMDEGEGIiDFNFEPDP 


5995 


2 


2437 


RPPGPGPASGAWIiCTRARGSAAFVPPLPRPPSRGARRRRRIjPGR 
GVAALRRGPGSAPGLPRGRAERS AAG SGRGPSREERGAAAAAAA 
AEMMEEIiHSL\DP\RRQEIiLKARF\TGLGVSKGPIiNSESSNQSL 
CSVGSl^DKEVETPEKKQNDQRNRKRKAEPYETSQGKGTPRGHK 








ISDYPERRVBQPLYGLDGSAAKE^TEEQ3AL?TL^VMLAKPRL 
DTEQLAQRGAGLCFTFVSAQQNSPSSTGSGNTEHSCSSQKQISI 
QHRO/T\QSDLTIEKI SALENS KNSDIjEKKEGRIDDLLRANCDLR 
RQI\DEQQKMLEKYK\ERLNRCFDNEPRNFLI eks kqekmacrd 
KS MQDRLRLGHFTTVRHG AS F TEQ WTDG Y AFQNL I KQQERINSQ 
REEIERQRKMLAKRKPPAMGQAPPATNEQKQRKSKTNGAENBTL 
TLAEYHEQEEIPKLRIiGHLKKEEAEIQAELBRLERVRNLHIREL 
KRIHNEDNSQPKDHPTLNDRYLLLHLLGRGGFSBVYKAPDLTEQ 
RYVAVKIHQLNKNWRDEKKENYHKHACRE YRI HKELDHPRI VKL 
YD Y FSLDTDS FCTVLE YCEGNDLDFYLKQHKLMS EKEARS I IMQ 
IVNALKYLNEI KPPI IH YDLKPGNILLVNGTACGEI KITDFGLS 
KIMDDDSYNSVDGMELTSQGAGTYWYLPPECFWGKEPPXISNK 
VD V WS VGV I F YQCL YGRK PFGHNQSQQD I LQENTI LKATEVQFP 
PKPWTPEAKAFIRRCLAYRKBDRIDVQQLACDPYLLPHIRKSV 
STS S PAGAAI AS TSGASNNS SSN 


5996 


1612 


981 


DQQACLLGLMLTLEFGILEFDPSWIGSWTQR/SWVSWRSRPGCE 
LFS I WFGS IVNEGYLNSASEGEEFCIYNRNPNACS YGVAVGVL 
AFLTCLLYLALDVYFPQISSVKDRKK\AVLSGHPWSGEPHPAA 
FWAFLWFTGDS CYL \ ANQWQVS KP KDNPLN EGTDAS PGRPSPFS 
FFS I FTWS LTAALAVRRFKDLSFQEE YSTLFP \ ASAQP 


5997 


1612 


981 


DQQACLLGLMLTLEFGILEFDPSV7IGSWTQR/SWVSWRSRPGCE 
LFSIWFGS I VNEGY LNSASEG E E F C I YNRN PNACS YGVAVGVL 
AFLTCLLYLALDVYFPQ I SSVKDRKK\AVLS GHPWSGE PHPAA 
FWAFLWFTGDSCYL\ANQWQVSKPKDNPLNEGTDASPGRPSPFS 
FFS I FTWSLTAALAVRRFKDLS FQEE YSTLFP \ ASAQP 


5998 


1612 


981 


DQQACLLGLMLTLEFG I LE FDP S W I GS W'l'QR / S W VS WRSRPGSB 
LFSIWFGS IVNEGYLNSASEGEEFCIYNRNPNACS YGVAVGVL 
AFLTCLL YLALDVY FPQ I SS VKDRXK\ AVLSGKIP WSGEPHPAA 
FWAFLWFTGDS CYL\ANQWQVS KPKDNPLN EGTDAS PGRPSPFS 
FFS I FTWSLTAALAVRRFKDLSFQEB YSTLFP \ ASAQP 


5999


2 


1790 


RPPMEKARRGGDGVPRGPVLHIVWGFHHKKGCQVEFSYPPLIP 
GIXSHDSHTLPBEWfOfLPFLALPDGAHNYQEDTVFFHLPPRNGNG 
ATVFGI S CYR \QIEAKALKVRQAD ITRETVQKS VCVLSKL PL YG 
LLQAKLQL I THAYFEEKDFSQIS X LKELYEHMNSSLGGAS LEGS 
QVYLGLSPRDLVLHFRHKGLI LFKL I LLEKKVLFYI SPVNKLVG 
ALMTVLSLFPGMIEHGLSDCSQYRPRKSMSEDGGLQESNPCADD 
FVS ASTADVSHTNLGT I RKVMAGNHGEDAAMKTEE PLFQVEDSS 
KGQEPNDTNQYLKPPSRPSPDSSESDWETLDPSVLEDPNLKERE 
QLGSDQTNLFP KDS VPS E S LP I TVQPQANTGQ WLI PGL I SG LE 
EDQYGMPLAI FT KG YLCL P YMALOQHHLLSDVTVRG FVAGATNI 
LFRQQXHLSDAIVEVBEALIQIHDPELRKLLNPTTADLRFADYL 
VRHVTENKUD V FLDGTG WEGGDEW IRAQFAVY I HALLAATLQL V 
LFR I VNVAXKI GNVMVTT\ SRNWQTGK\AVGQS VGGAFS \ S AK 
TA\MS S WLSTFTTS TSQSLTEPPDEKP 


6000 


101 


1561 


TEP CRTAEN CTATMS ENNKNSLESSLRQLKCH FTWNLMEG ENS L 

DDFEDKVFYRTEFQNREFKATMCNLLAYLKHLKGQNEAALECLR 

KAEELIQQEHADOASIRSLVTWGNYAVTVYYHMGRLSDVQIYVDK 

VKHVCEKFSSPYRI SSPELD CEEGWTH LKCGGNQNERAKVCFEK 

ALEKKPKNPEFTSGLAIASYRLDNWPPSQNAIDPLRQAIRLNPD 

NQYLKVIiIJ^ICLHKMREEGBBEGEGEK\LVEEAI,EKAPG\VTDV 

jjko/vh \Nr iKvji\.UhiFUKAIhLLiU^ALBYIP\NNAYLHCQIGCC 

RAKVFQVMNLRBNGI^GKRKLLELIGHAVAHLKKADEANDNLFR 

VCS 1 LAS LHALADQ YEDAE YYFQKEFS KELTPVAKQLLHLR YGN 

FQLYQMKCEDKAIHHFIEGVKINQKSREKEKMKDKLQKIAKMRL 

SKNGADSEALHVLAFLQELNEKMQQADEDSERGLESGSLIPSAS 

SWNGE 


6001


176 


1038 


AFAllSPSRGHRKTHIHTPRHTPRCTMAESHLQSSLITASQFFEI 
WLHFDADGSGYLEGKELQNL I QELQQARKKAGLELS PEMKTFVD 
QYGQRDDGKIGIVELAHVLPTEENFLLLFRCQQLKSCS\EFMKT 
WRKYDTDHSGFI ETEELKNFL KDLLEKANKTVDDTKLAE YTDLM 


6002






PELYr^TCNGYIDBNELDALLKDLCEKNKQDLDINNlTTYKKKI 
MALSDGGKLYRTDLALILCAGDN 


6003 


977 


61 


LAPPGGGLHIPPRTPLSHSRPPPSHHAPHPSPLPLPPADLrfPHS 
SMAQRSDLliEIjDCQLTRDRWWSHDENLCRQSGLNRDVGSLDP 
EDL PIiYKEKLE VYFS PGHFAHGSDRRMVRLEDLFQR FPRTPMS V 
EIKGKNEELIREQ/VLVRRYDRNEITIWASBKSSVMKKCKAANP 
EMPLS FT I SRGFWVLLS Y YLGLLPF X P I PEKFFFCFIjPNI INRT 

YFPFSCSCIiNQIiliAWSKWLIMRKSLIRHLEBRGVQWFWCLNE 
ES DFEAAFS VGATGVI TD YPTALRHYLDNHGPAARTS 


6004 


140 


4098 


GJCLRAFRGMRRLI CKRI CDYKS FDDEES VDUN RPS SAASAFKVP 
APKTSGNPANSARKPGSAGGPKVGAGASKEGGAGAVDEDDFIKA 
Vl'DVPS IQIYS SRELEETLNKIRE IL SDDKHD WDQRANAIiKKIR 
SLLVAGAAQYDCFFQHLRLLDGALKLSAKDLRSQWREACITVA 
KLSTVLGNKFDHGAEAI VPTLFNLVPNSAKVMATSGCAAIRFI I 
RHTHVPRLIPLITSNCTSKSVPVRRRSFEFLDLLLQEWQTHSLE 

rhaavlvetikkgihdadaearvearktymglrnhfpgeaetly 

NSIiEPSYQKSLQTYLKSSGSVASLPQSDRSSSSSQESLNRPFSS 

kwstanpstvagrvsagsskasslpgslqrsrsdidvnaaagak 

AHHAAGQSVRSGRLGAGALNAGSYASIiEDTSDKLDGTASEDGRV 

Raklsaplagmgnakadsrgrsrtkmvsqsqpgsrsgspgrvlt 
ttalstvssgvqrvlvnsasaqkrskiprsqgcsreaspsrlsv 

ARSSRIPRPSVSQGCSREASRESSRDTSPVRSFQPLASRHHSRS 

tgalyapevygasgpgygisqssrlsssvsamrvlntgsdveea 
vadalllgdirtkkkparrryesygmhsdddansdassacsers 

YS SRNGS IPT YMRQT \EDV\AEVLNRCASSN5fS ERKEGLLGLQN 
LLKNQRTLSRVELKRLCEIFTRMFADPHGKRVFSMFLETLVDFI 
QVHIO^DLQDWLFVLLTQLLKiCMGADLLGSVQAKVQKALDVTRES 
FPNDIiQFNILMRFTVDQTO/TPSLKVKVAILKYI ETLAKQMDPGD 
FINSS ETRLAVSRVITWTTEPKS SDVRKAAQS VL ISLFE LNTPE 
FTMLLGALPKTFQDGATKLLHNHLRNTGNGTQSSMGSPLTRPTP 
RSPAN WS S PLTSPTNTSQNTLSPSAFD YDTEKMNS EDI YSS L^G 

vteaiqnfsfrsqe^ineplkrdskkddgdsmcggpg\msdpra 

GGDATDSSQTAL\DNKASLLHSMPTHSSPRSRDYNPYNYSDSIS 
PFNKSALKEAMFDDDADQFPDDLSLDHSDIiVAELLKRISNHNER 
VEERKIALYELMIOiTQEESFSVWDEHFKTILLLLLETLGDKEPT 

iralalkvlre ilrhqparfknyaeltvmktleahkdphxewr 

SAEEAASV\ LATS I \SPEQCIKVLCPI IQTADYPINLAAIKMQT 

KVI ER VS KETLNLLLPE impgliqg ydnsessvrkacvfclvav 

HAVIGDELKPHLSQLTGSKMKLLNL YI KRAQTGSGGADPTTDVS 
GQS 




140 


4098 



GiUikAFRGMRRLICKRlC'l) ¥ KS FDDEKSVDGNRPSSAASAFKVP 
APKTS GNPANSARKPGSAGGPKVGAGAS KEGGAGAVDEDDFI KA 
FTDVP S I Q I YSS RELEETLNKIREILSDDKHDWDQRANAIiKKI R 
SLLVAGAAQYDCFFQHLRLLDGALKLSAKDLRSQVVRBACI TVA 
HLSTVLGNKFDHGAEAI VPTLFNLVPNSAKVMATSGCAAIRFI I 
RHTHVPRLIPLITSNCTSKSVPVRRRSFEFLDLIiIiQEWQTHSIiE 
RHAAVLVETIKKG I HDAJDAEARVEARKTYMGLRNHFPGEAETL Y 
NSLEPS YQKSLQT YLKS SGS VAS LPQSDRS S S SSQESLNRPFS S 
KWSTANPSTVAGRVSAGSSKASSIiPGSLQRSRSDIDVNAAAGAK 
AHHAAGQS VRSGR LGAGALNAGS YASLEDTSDKLDGTAS EDGRV 

RAKLSAPLAGMGNAKADSRGRSRTKMVSQSQPGSRSGSPGRVLT 
TTALSTVSSGVQRVLVNSASAQKRSKI PRSOGCSREAS PSRLS V 
ARSSRIPRPSVSQGCSREASRB33RDTSPVRSFQPIASRHHSRS 
TGALYAP E VYGASGPG YGISQS SRL3SS VSAMRVLNTGSDVEEA 
VADALLLGDIRTKKKPARRR YES YGMHSDDDANSDAS S ACSERS 
if S SRNGS I PT YMRQT\ED V\AE VLNR CASSNWS ER KEGLLGL QN 
tiLKNQRTLSRVBLKRLCEIFTRMFADPHGKRVFSMFLETLVDPI 
JVHIGDDLQDWLFVIiLTQLLKKMGADLLGSVp^UCVQKALDVTRES 
'PNDLQFNI LMRFTVDQTQTPS LKVKVAI LKY I ETLAKQMDPGD 
'INSSETRIAVSRVITWTTEPKSSDVRKAAQSVLISLFELNTPE 








FTMLLGALP KTFQDGATKLLHNliLfeNTGNGTQSSMGS PLTRPTP
RSPANWSSPLTSPTNTSQNTLSPSAPDYDTENMNSEDIYSSLRQ 
VTEAIQNPS FRSQEDMNEPLKRDSKKDDGDSMCGGPG \MSDPRA 
GGDATDS SQTAL\DNKASLLHSMPTHSS PRSRDYNPYNYSDS I S 
PFNKSALKEAMPDDDADQFPDDLSLDHSDLVAELJjKELSIJHKER 
VEERKIALYELMKLTQEESFSVWDEHFKTILLLLLETLGDKEPT 
IRALALKVLREILRHQPARFKNYAEIjTVMKTLEAHKDPHKEVVR 
SAEEAASV\LATSI\SPEQCIKVLCPIIQTADYPINLAAIKMQT 
KV I ER VS KETLNLLL PE IMPGLIQG YDNSE SS VRKACVPCLVAV 
HAVIGDELKPHLSQLTGS KMKLLNLYI KRAQTGSGGADPTTDVS 
GQS 


6005 


133 


5955 


RSSGRRQEQLGQFPGRERKGMASGLGSPSPCSAGSEEEDMDALL

NNSLPPPHPENEEDPEEDLSETETPKLKKKKKPKKPRDPKIPKS 

KRQKKERMLIiCRQLGDS SGEGPE FVEEEEE VALRSDS EGSDYTP 

GKKKKKKLGPKKEKKSKSKRKEEEEEDDDDDDDSKEPKSSAQLL 

BDWGMEDIDHVFSEEDYRTLTNYKAFSQFVRPIjIAAKNPKIAVS 

KMMMVLGAKWR2 FS TNNP FKGS SGAS VAAAAAAAVAWESM VTA 

TEVAPPPPPVEVPIRKAKTKEGXGPNARRKPKGSPRVPDAKKPK 

PKKVAPLKI KLGGFGSKRKRSSSEDDDLDVESDFDDAS INSYSV 

SDGSTSRSSRSRKKIiRTTKKKKKGEEEVTAVDGYETDHQDYCEV 

CQQGGE1ILCDTCPRAYHMVCLDPDMEKAPEGKWSCPHCEKEGI 

QWEAKEDNSEGEEILEEVGGDLEEEDDHHMBFCRVCKDGGELLC 

CDTCPSSYHIHCLNPPLPEIPNGEWLCPRCTCPALKGKVQKILI 

WKWGQPPSPTPVPRPPDADPNTPSPKPLEGRPERQFPVKWQGMS 

YWHCSWVSELQLELHC\QVMFRNYQRKNDMDEPPSGDFGGDEEK 

S\RKRKNKDPKFAEMEERFYRYGIKPEW\MMIHRIIjNHSVDKKG 

HVHYLIKWRDIjPYDQASWESEDVEIQDYDLFKQSYWNHRELMRG 

eegrpg kklkkvklrklerppetptvdptvkyerqpe yldatgg 

TLHPYQMEGU^RFSWAOJSTDTIIiADEMGLGKTVQXAVFLYSL 

ykeghskgpflvsaplstiin\werefemwapdmyv\vtyvgdk 
dsrai irenefs \ fednai rggkkasrmkkeas vkfh vllts ye 
lit idmailgs i dwacl i vde ahrl knnqs kffr vlngys lqhk 
llltgtplqnnleelfhllnfltperfhnlegfiieefadiaked 
qikiolhdmlg\phmlrrlkadvfknmpskteriiv\rvelspm\q 

KXYYK\ YI LHS KFLKALN \ARGKKINQVSIiLNVVMDI*KKCCNHP Y 
LFP VAAMEAPKMPNGMYDGSALI RAS GKLLLLQ KML KNLKEGGH 
RVLI FSQMTKMLDLLEDFLEHEGYKYERIDGGITGNMRQEAIDR 
FNAPGAQQFCFLLSTRAGGLGINLATADTVI I YDSDWNPHNDIQ 
AFSRAHR IGQNKKVMI YR FVTRASVEERITQVAKKKMMLTHLVV 
RPGLGS XTGSMS KQELDD ILKFGTBELFKDEATDGGGDNKEGED 
S S V I HYDDKA IERLLDRNQDETEDTBLQGMNE YliS S FKVAQ YW 
REEEMGEEEEVERE II KQEESVDPDYWEKLLRHHYEQQQEDLAR 
NU5KGKR I RKQ VNYNDGSQEDRDWQDDQSDNQSDYS VAS E EGDE 
D FDERS EAPRR PSRKGLRNDKDKPLPPIiLARVGGNI E VLGFNAR 
QRKAFLNAIMRYGMPPQDAFTTQWIiVRDLRGKSEKEFKAYVSLF 
MRHLCEPGADGAE T FADGVPREGLSRQHVLTR IGVMS L I R KKVQ 
EFEHVNGRWSMPELAEVEBNKKMSQPGSPSPKTPTPSTPGDTQP 
NTPAPVPPAEDOIKIEENSLKEEESIEGEKEVKSTAPETAIECT 
QAPAPAS EDE KVWE PPEGEEKVEKAEVKERTEE PMETE P KGKG 
AADVEKVEE KSAI DLTPI WEDKEEKKEEEEKKE VMLQNGETP K 
DLNDEKQKKNIKQRFMFNIADGGFTELHSLWQNEERAATVTKKT 
YEIWHRRHDYWLIiAGIINHGYARWQDIQNDPRYAILNSPFKGEM 
NRC^FLEIKNKFLARRFKLLEOALVIEEQLRRAAYLNMSEDPSH 
PSMALNTRFAEVECLAESHQHLSKESMAGNKPANAVLHKVLKQL 
EELLSDMKADVTRLPATIARI PPVAVRLQMSERNILSRLANRAP 
EPTPQQVAQQQ 


6006


1 


965 


DNDFLRNTVHRHBPPVTAEP IRLLAENEDWWDKPSS I PVHPC 
GRFRHNTVI FILGKEHQLKELHPLHRLDRIiTSGVLMFAKTAAVS 
ERIHEQVRDRQLEKEYVCRVEGEFPTBEVTCKEPILWSYKVGV 
CRVDPRGKPCETVFQRLSYNGQSSWRCRPLTGRTHQIRVHLQF 
LGHPILNDPIYNSVAWGPSRGRGGYIPKTNEELLRDLVAEHQAK 








QSLDVIaDLCEGDLSPGLTDSTAPSSELGKDDIiEEIiAAAA\QKMS 
E VAE AAPQE1»DTI ALAS E KAVETDVMNQ \RQT\TLCRVPAG ATG 
SLAPRPCDVPTCPTL 


6007 


3 


2351 


HELGQVEYVFTDKTGT1*TENEMQPRECS INGMKYQEINGRLVPE
GPTPDSSEGMLS YLSSliSHliNNLSHLTTS SS FRTS PEN ETEJbl K 
EHDLFFKAVSLCHTVQINNVQTDCTGDGPWQSNLAPSQIiEYYAS 
SPDEKALVEAAARIGIVFIGNSEETMEVKTLGKLERYKLLHILE 
FDSDRRRMSVIVQAPSGEKLLFAKGAESSILPKCIGGEIEKTRI 
HVDEFALKGIiRTLCIAYRKFTSKEYEEIDKRIFEARTALQQR\E 
E KL AAVFQF I EKDLI LIX3ATAVEDRLQDKVRET I BALRMAG I KV 
W VLTGD KHETAVS VS LS CGHFHRTMN I LEL I NQ KS DSECAEQLR 
QLARR I TEDHVIQHGL VVDGTSLSIiALREHEKLFMEVCRNCSAV 
LCCRMAPLQKAKVIRL 1 KI S PEKP 1 TLAVGDGAND VSM 1QE AHV 
GIGIMGKEGRQAARNSDYAIARFKFLSKLLFVHGHFYYIRIATL 
VQYFFYKNVCFITPQFIiYQFYCLFSQQTLYDSVYLTLY\NICPT 
SL P I L I YSLLEQHVDPHVIiQNKPTLYRD ISKNRLLS I KTFI*YWT 
ILGFSHAFIFFFGSYLLIGKDTSLLGWGQMFGNWTFGTLVFTVM 
VITVTVXMALETHFT^WINHLVTWGSIIFYFVFSLFYGGILWPF 
LGSQNM YFVF IQLL SSGS AWFA I ILM WTCLFLD 1 1 KKVFDRHL 
HPTSTEKAQLTETNAG I KCLDSM CC FPEGEAACAS VGRMLE R VI 
GRCS PTHISRS WSASDPF YTNDRS I LTLSTMDSSTC 


6008


4554 


1089 


A3VRRAGARRGPGRALPAGATAVPPPSARRRRRCPAPEHAG PAR 

ASRPSQETMFQLPVNNLGSLRKARKTVKKILSDIGLEYCKEHIE 

DFKQFEPNDFYLKNTTWEDVGLWDPSLTKNQDYRTKPFCCSACP 

FSSKFFSA YKS H FRNVHS ED FENR 3 LLNCPYCTFNADKKTLETH 

I KI FHAPNASAP SS SIiSTFKDKNKNDGLKPKQADS VE QAVY YCK 

XCTYRDPLYE XVRKHI YREHFQHVAAP Y IAKAGE KS LNGAVPLG 

SJIAREESSIHCKRCI.FMPKSYEALVQHVIEDHERIGYQVTAMIG 

HTNWVPRSKPLML I APKPQDKKSMGLPPRIGS LAS GNV\RS I*P 

SQQMVNRLS I PKPNLNSTGVNMMSS VHLQQNN YG VKS VGQGYS V 

GQSMRLGLGGNAPVSIPQQSQSVKQLLPSGNGRSYGLGSEQRSQ 

APARYSLQSANASSLSSGQLKSPSLSQSQASRVLGQSSSKPAAA 

ATGPPPGNTSSTQKWKICTICNEJuFPENVYSVHFEKEHKAEKVP 

AVANYI MK IHNFTS KCLYCNR YLPTDTLLNHML IHG LSCPYCRS 

TFNDVEKMAAHMRMVHIDEEMGPKTDSTLSFDLTLQQGSHTNIH 

LLVTTYNLRDAPAESVAYHAQNNPPVPPKPQPKVQEKADIPVKS 

SPQAAVPYKXDVGKTLCPLCFS I LKGPISDALAHHIgRERHQVIQ 

TVHP VEKKliT YKCI HCLGVYTSNMTAST ITLHLVHCRGVGKTQN 

GQDKTNAPSREiNQSPSLAPVKRTYEQMEFPLLKKRKLDDDSDSP 

SFFEEKPEEPWLALDPKGH\EDDSYEARKSFLTKYFT\KQPYP 

TRREIEKLAASLWV\WK\SDIASHFSNKRKKCVRDCEKYKPGVL 

LGFNMKELNKVKHEMDFDAEGLFENHDEKTS RVNAS KTAD KKLN 

LGKEDDSSSDSFENLEE3SNBSGSPFDPVFEVEPKISNDNPEEH 

VLKVI PEDAS ESEEKLDQKEDGSKYETIHLTEEPTKLMHNASDS 

EVDQDDWEWKDGASPSESGPGSQQVS0FEDNTCEMKPGTWSDE 

S S QSEDARS S KPAAKKKATMQGDREQLKWKNSS YGKVEG FWS KD 

QSQWXNASENDERLSNPQIEWQNSTrDSEDGEQFDNMTDGVAEP 

MHGSLAGVKLSSQQA 


6009 


4272 


1534 


CHGLQHLTPFRELNLSLQG*EPH*AA*QAVRSEEKSIC*GSPSC
HLVLG VLVPVARQSSHSAGPAQS AFR * TGTG S GTPKAAEQS GYW 
EAYTLG HQHWNM FPIQRPPLVM KGRRIMGGKCEKG * VSDS VTGG 
RAVAGEQASQRRTVFTAGGGECItfAKSVRASVFTGNQPGVMGLL 
NGKRGGCFESG YLFGFI VTGKI QSLEAKVPIiPVNGQTGERAS PG 
NCRIHIVDAVC*SEHH*DHFIAAAFLENSTIIS*VAPGSWQDHA 
VLQKE VQAS VRCRGFES VDTAPAGFWAHS P PGLQGEPTTTS VSL 
FVLAPQDGEGVPFVEGQLVTVLGIjVVPQS IRHTFVHHTQIiFLHP 
I * KLGALDVAFLHLLTLVCSSFNVAYG*GKNGGTTLHQLFAEVN 
AVTRGS AVQRRPS IT 1 SS I HVDTKIQQELHDVMVAGADGWQWG 
DPF WGLAGI FHL I DDPLHQ IELS FQRRV* EQCQGVKPDSQPVP 
RPLRVGIiliQVGPLVRGGGRRVAGRGKRCWRDLIjFPWRWGLSHRT 
RDLLRGGDRGHVVVIVLCRLGSLVGGLGTDELIiWFGGR* hll IG 








I**RGRLSGEWGCGLGRGELFQVSIGIGVSIVHIGQGDHBVIjGG 
AGLVERGALHATGQGVEALVQQLLDVGPAGALGIjCDGAALFQGP 
GRVGQLPAEGLQVCITLVAQWRMHDGRELGGAEWPWQALHGAAI 
CGVGGAILLKALSQYFLKGG*RLWCARGQ*PVKKRQRRWRG*TR 
R *NGLTIHCFN* L I *GAVCCRLVILRWCGLLBVHGVYGT* IHCL 
GSFPGRLWP+ PPISQERPNGHCQVJE FRLAVPSWKCRWSRWRVRG 
TWRYGNPLLNLL* GAWLGGAACGGQQGGPLSTWQACTGPGQAAF 
LPPFQGACRPRTQRCRTWVCPIAWRQLLAYTRD 


6010 


1 


3533 


IMPCGSSRLLRGCWTHPNEPVSDLSYFDCIESVMENSKVLGESM 
AGI SQNAKTGDLPAFGECVGIAS KALCGLTEAAAQAAYLVGIFD 
PNS Q AGHQG LVD P IQFARANQAIQMACQNLVDPGSS PSQVLS AA 
TI VAKHTSALCNACRIASS KTANPVAKRHFVQSAKEVANSTANL 
VKTI KALDGDFS EDNRNKCRI ATAPL I EAVENLTAFASNPEF VS 
I PAQ I S SEGSQAQB P I LVSAKPMLE S S S YI*IRTARSLAINPKDP 
PTWSVLAGHSHTVSDSI KSLITS I RDKAPGQRECDYS IDG INRC 
IRDIBQASLAAVSQSLATRDDISVEALQEQLTSWQEIGHLIDP 
I ATAARGEAAQLGHKGTQLAS YFEPL I LAAVGVAS KILDHQQQM 
TVLDQTKTLAESALQMLYAAKEGGGNPKAQHTHDAITEAAQLMK 
EAVDDIMVTIiNKAASEVGLVGGMVDAIAEAMSKliDEGTPPEPKG 
TFVDYQTTWKYS KAXAVTAQEMMTKS VTOPB ELGGLASQMTS D 
YGHLAFQGQMAAATAE P E E I G FQ I RTR VQDLGHGC I FLVQKAG \ 
ALQVCPTDS YTKRELIE CARAVTEKVSLVLSALQAGNKGTQACI 
TAATAVSGIIADLDTTIMFATAQTLNAENSETFADHRENILKTA 
KALVEDTKLLVSGAAST PD KLAQAAQSSAATI TQLAE WKLGAA 
SLGSDDPETQWLINAI KD VAKALS DL. I S ATKGAAS KPVDDPSM 
YQLKCyVAPCV^^VTKVTSLLiCTVKAVEDEATRGTRALEATIECIKQ 
ELTVFQS KDVPEKTS S PEES IRMTKG ITMATAKAVAASNS CRQE 
DVIATANLSRKAVSDMLTACKQASFHPDVSDEVRTRALRFGTEC 
TKTYLDL LEHVLV I Z^Q KPT P ELKQQ LAAFS KRVAGAVTEL I QAA 
EAMKGTEWVDPBDPTVIAETELLGAAAS IEAAAKKLEQLKPRAK 
P KQADETLDFEEQ I LEAAKS I AAATSALVKSAS AAQRELVAQGK 
VGS I PANAADDGQWSQGLI SAARMVAAATSSLCEAANASVQGHA 
SEEKLI SS AKQVAASTAQLLVACKVKADQDSEAMRRLQAAGNAV 
KRASDNLVRAAQKAAFGKADDDD VWKTKFVGG I AQI I AAQEEM 
LKKERELBEARKKLAQIRQQQYKFLPTBLREDEG 


6011 


446 


1835 


LLQPAMRKS PGLS DCLWAWI LL LSTLTGR S YGQP£ LQDELKDNT

TVFTR I LDRLLDG YDNRLRPGLGERVTE VKTDI FVTS FGPVSDH 

DMEYTIDVFFRQSWKDERLKFKGPMTVLRLNNLMASKIWTPDTF 

FHNGKKSVAHNMTMPNKLLRITSDGTIJJYTMRiTVR\AECPMAF 

GRDFPM\D\AHACPLKFQSYAYTRAEWYEWTREPARSWVAED 

GSRLNQYDLLGQTVDSGIVQS5TGEYVVMTTHFHLKRKIGYFVI 

QTYLPCIMTVILSQVSFWLNRESVPARTVFGVTTVLTMTTLSIS 

ARNSL PKVAYATAMDW FI AVCYAFVFSAL I EFATVNYFTKRG YA 

WDGKSWPEKPKKVKDPLI KKNNTYAPTATSYTPNLARGDPGLA 

TIAKSATIEPKEVKPETKPPEPKKTFNSVSK1DRLSRIAFPLLF 

GIFNLVYWATYLNREPQLKAPTPHQ 


6012 


351 


5013 


PAELFQSFAIWHKBLYDWRLGPWNQCQPVISKSLEKPLECIKGE 
EGIQ VREIACIQKDKD I PAE D I 1 CEYFEPKPLLEQACLIPCQQD 
CIVSEFSAWSECSKTCGSGLQHRTRHWAPPQFX3GSGCPNLTBF 
QVCQS SPCEAE ELRYS LHVG P WS TCSMPHS RQVRQARRRGKNKE 
REKDRSKGVKDPEARELIKKKRNRNRQNRQENKYWDIQIGYQTR 
EVMCINKTGKAADLSFCQQEKLPMTFQSCVITKECQVSEW3EWS 
PCSKTCHDMVSPAGTRVRTRTIRQFPIGSEKECPBFEEKEPCLS 
QGDG WP CATYGWRTTE WTECR VDPLLSQQDKRRGNQTALCGGG 
I QTRE VYCVQANENLLS QLSTH KNKEAS KPMDLKLCTG P I PNTT 
QLCHIPCPTECEVSPWSAWGPCTYENCNDQQGKKGFKLRKRRIT 
NEPTGGSGVTGNCPHLL2AIPCEEPACYDWKAVRLGDCEPDNGK 
ECG PGTQVQEWCINS DGEBVDRQLCRDAI FPI P VACDAPCPKD 
CVLSTWSTWSSCSHTCSGKTTEGKQIRARSILAYAGEEGGIRCP 
NSS ALQEVRS CNEHPCTVYHWQTGPWGQCI EDTS VSS FNTTTTW 
NGEASCSVGMQTRKVI CVRVNVGQVGPKKCPESLRPETVRPCLIi 








PCKKDCI VTP YSDWTSCPS \SCKEGDSS IRKQSRHRVI IQLPAN 

GGRDCTDPLYEBKACEAPQACQSYRW\KTHKW\HRCQ\LVP\WS 

VQQDSP\GAQEGCGPGRQARAITCRKQDGGQAGIHECLQYAGPV 

PALTQACQI PCQDDCQLTSWSKFSS CNGDCGAVRTRKRTI*VGKS 

KKKEKCKNSHLYPLIETQYCPCDKYNAQPVGNWSDCILPEGKVE 

VLLGMKVQGDIKECX3QGYRYQAMACYDQNGRLVETSRCNSHGYI 

EEACIIPCPSDCKLSEWSNWSRCSKSCGSGVKVRSKWLREKPYN 

GGRPCPKLDHVNQAQVYEVVPCHSDCNQYLWVTBPWS I CKVTF V 

NM RENCGEGVQTRKVRCMQNTADG PS EHVEDYLCDPE EMPLGSR 

VCKLPCPEDCVISEWGPWTQCVLPOJQSSFRQRSADPIRQPADE 

GRSCPNAVBKEPCNLNKNCYHYDYNVTDWSTCQLSEKAVCGNGI

KTRMLIXrVRSDGKSVDLKYCEALGLEKNWQMNTSCMVECPVNCQ 

LSDWSPWSECSQTCGL7GKMIRRRTVTQPFQGDGRPCPSLMDQS 

KPC P VKPCYRWQYGQ WSP CQ VQEAQCGEGTRTRNIS CWSDGS A 

DDFSKWDBEFCADIBLI I DGNKNM VLEE S CS QPCPGDCYLKDW 

SSWSLCQLTCVNGEDLGFGGIQVRSRPVIIQEliENQHIiCPE^IL 

ETKSCYDGQCYEYKWMASAWKGSSRTVWCQRSDGINVTGGCliVM 

SQPDADRSCNPPCSQPHSYCSBTKTCHCEEGYTBVMSSNSTLEQ 

CTLI P VWL P TMEDKRGDVKTS RAVH PTQPS SNPAGRGRTW FLQ 

PFG PDGRLJCTWVYGVAAGAFVLLI F I VSM I Y LACKKP KKPQRRQ 

NNRLKPLTLAYDGDADM 


6013


1161 


710 


GAFIAGVPVQPVLIRYPNSLDTTSWAWRGPGVI.KVLWLTASQPC 
S I VDVEFLPVYHPSPEESRDPTLYANNVQRVMAQADGI PATECB 
FVGSLPVIWGRLKVALEPQL/WGTGKSASEGWAVRKLCGRWGR 
ARPESNDQPGRVCQAATAL 


6014


2657 


613 


EAVAGGMEKSRMWLPKGPDTIjCFDKDEFMKEDFDVDHFVSDCRK
RVQLEELRDDLELYYKLLKTAMVBLIMKDYADF\VNLSTMLVGM 
DKALNQLSVPLGQLREBVLSLRSSVSESIRAVDERMSKQEDIRK 
KKMCVLRLIQVIRSVEKIEFQLNSQSSKETSALEASSPLLTGQI 
LERIATEFNQLQFHACQSK\GMPLLDJCVRPRIAGITAMLQQSLE 
GLLLEG LQTSDVD 1 1 RHCLRTYAT I DKTRDAEALVGQVLVKP Y I 
DEVI IEQFVESHPNGLQVMYNKLLEFVPHHCRLLREVTGGAISS 
EKGNTVPGYDFLVNS VWPQ I VQGLE EKLPS LFNPGNPDAFHEKY 
TISMDFVRRLERQCGSQASVKRLRAHPAYHSFNKKWNLPVYFQI 
RFREI AGSLEAALTDVLEDAPAES PYCLLASHRTWSSLRECWSD 
EMFLPLLVHRLWRLHSGR F WAR YS VFV\N\ E LS LRP ISNES P KB 
I K KPL VTGS KEPS I TQGNTEDQGSG PS ETKP WS ISRTQLVYW 
ADLDKLQEQLPELLEI IKPKLEMIGFKNFSS IS AALEDSQSS FS 
ACVPSLSSKI IQDLSDSCFGFLKSALEVPRLYRRTNKEVPTTAS 
SYVDSALKPLFQLQSGHKDKLKQAI IQQWLEGTLSESTHKYYET 
VSDVLNS VKKMEES LKRLKQARKTT PANPVGPSGGMSDDDKI RL 
QLALDVEYLGEQ IQKLGLQASDI KS FSALAELVAAAKDQATAEQ 
P 


6015


13 


2237 


AEG CAERRGT E PWELSMS WE SGAGPGLGSQGMDLVWSAW YGKC 
VKGKGSIiPLSAHG I VVAWLS RAEWDQVTVYLPCDDHKLQRYALN 
RITVVIRSRSGNELPLAVASTADLIRCKLIjDVTGGLGTDBLRLLY 
GMALVRFVNLISERKTKFAKVPLKCLAQEVNI PDWI VDLRHELT 
HKKMPH I^CRKGC YFVLDWLQKTYWCRQLENSLRETWELEEFR 
EGIEEEDQEEDKNIWDDI TEQKPBPQDDGKSTESDVKADGDS K 
GSEEVDSHCKKALSHKELYERARELLVSYEEEQFTVLEKFRYLP 
KAIKAWNNPSPRVECVLAELKGVTCENREAVLDAFLDDGFLVPT 
FEQIiAAI^IEYEENVDIiNDVLVPKPFSQFWQPLLRGLHSQNFTQ 
ALLERMLSELPALGISGIRPTYILRWTVELIVAIJTKTGRNARRF 
S ACQ WEARRG WRLFNCSAS LDWPRM VES CIjGS PC WAS PQLLR 1 1 
F\KAMGQGLQDE\EQEKLIiRICSIYTQSGENSLVQEGSEASPIG 
KSPYTLDSLYWSVKPASSSFGSEAKAQQQEEQGSVNDVKEEEKE 
EKEVLPDGVEEEEENDDQEEEEEDEDDEDDBEBDRMEVGPFSTG 
QESPTAENARLLAQKRGALQGSAWQVSSEDVRWDTFP\LGRMPR 
SRPRTPAELMLENYDTHVI FWTKPVL\EQRLEPSTCK\TDTLGL 
\SCGVGS\GNCSNSSS£NFRGAFLLEARGSLH\GL\KTGLQIjF 


6016


13 


2237 


ASGCAERRGTEPWELSMSWESGAGPGIiGSQGMDLVWSAWYGKC 








VKGKGSLPIiSAHGIVVAWLSRAEWDQVTVYLFCDDHKLQRYAIjN 
R ITVWRS RS G NEL P LAVAS TADLI RCKLLDVTOGLGTDELRLL Y 
GMALVRFVNLISERKTKFAKVPLKCLAQEVNIPDW IVDLRHBLT 
HKKMPEINDCRRGCYPVLDWLQKTYWCRQLENSLRETWELEEFR 
EGIEEEDQEEDKNIWDDITEQKPEPQDDGKSTESDVKADGDSK 
GSEEVDSHCKKALSHKELYBRARELLVSYEEEQFTVLEKFRYfcP 
KAIKAWNNPS PRVECVfcAELKGVTCENREAVLDAFLDDGFLVPT 
FEQIJ^Q I EYEENVDL^VL VP KPFSQFWQPXiLRGLHSQNTFTQ 
ALLERMLSELPALGISGIRPTYILRWTVELIVANTKTGRNARRF 
SAGQWEARRGWRLFNCSASLDVJPRMVESCIiGS PCWASPQLLRI I 
F\KAMGQGLQDE\EQEKLLRICSIYTQSGENSLVOEGSEASPIG 
KSPYTLDSLYWSVKPASSSFGSEAKAQQQEEQGSVNDVKEEEKE 
EKEVLPDQVEEEEBNDDQEEEEEDEDDEDDEEEDRMEVGPFSTG 
QESPTAENARLLAQKRGALQGSAWQVSSEDVRWDTFP\LGRMPR 
SRPRTPAELMLENYDTHVIFWTKPVL\EQRLEPSTCK\TDTLGL 
\ S CGVGS \ GNCSNS SS SNFRGAFLLEARG S LH \GL\ KTGLQL F 


6017 


203 


3469 


SHQE I EQNS AMAPRKRGGRGIS F I FCCFRNNDHP E I T YRLRNDS 
NFALQ1MEPALPMPPVEELDVMFSELVDELDLTDKHREAMFALP 
AEKKWQIYCSKKKDQEENKGATSWPEFYIDOLNSMAARKSLLAL 
E KEEEBERSKTI ESLKTALRTKPMRFVTR F 1 DLDGLSCX LNFLK 
TMDYETSESRIHTSLIGCIKALMNNSQGRAHVLAHSESINVIAQ 
SLSTENIKTKVAVLEILGAVCLVPGGHKKVIiQAMLHYQKYASER 
TR FQTLINDLDXSTGRYRDE VS LKTAIMSFINAVLSQGAGVESL 
D FRLHLRYE \ FLMLGIHPVMDKLRKHENSTLDRHLDFFEMLRNE 
DELEFAKRFELVHIDTKSATQKFELTRKRLTHSEAYPHFMSILH 
HCLQMP YKRSGNTVQ YWLLLDR 1 1 QQI VIQNDKGQDPDS TPLEN 
FN I KNVVRMLVNENE VKQWKEQAE KMRKEHNELQQKLEKKEREC 
DAKTQEKEEMMQTLNKMKEKLEKETTEHKQVlCQQVADLTAQLiHE 
LSRRAVCAS I PGGPSPGAPGGPFPSS VPGSLLPPPPPPPLPGGM 
LPP P PPPLPPGGPPPPPGPPPIiGAIMPPPGAPMGLALKKKS I PQ 
PTNALKS FNWS KLPENKLEGTVWTE I DDTKVFKIIiDLEDLERTF 
SAYQRQQDFFVNSNSKQKEADAIDDTLSSKLKVKELSVIDGRRA 
QNCNILLSRLKLSNDEIKRAILTMDEQEDLPKDMLEQLLKFVPE 
KSD1DLLEEHKHELDRMAKADRFLFEMSRINHYQQRLQSLYFKK 
KFAER VAE VXP KVEAIRSGS EEVFRS GALKQLLEVVLAFGNYMN 
KGQRGNAYGFK I SS LNKI ADTKSS I DKN I TLLHYL ITI VENKYP 
SVLNLNEEIJU)IPQAAiCVNMTELDKEISTLRSGLKAVETELEYQ 
KSQPPQPGDKFVSWSQFITVASFSFSDVEDLLAEAKDLFTKAV 
KHFGEEAGKIQPDEFFGIFDQFLQAVSEAKQENENMRKKKEEEE 
RRARMEAQLKEQRERERKMRKAKENSEESGEFDDLVSALRSGEV 
FDKDLSKLKRNRKR ITNQMTDSSRERP I TKLNF 


6018 
6019 


13 
2 


2510 

1066 


T ISQSGG I RRR REAVW FE WNMDFS RLHM YS PPQCVPENTG YTY 

ALSSSYSSDALDFETEHKLDPVFDSPRMSRR5LRLATTACTLGD 

GEAVGADSGTSSAVSLKNRAARTTKQRRSTNKSAFSINHVSRQV 

TSSGVSYGGTVSLQDAVTRRPPVLDESWIREQTTVDHFWGLDDD 

GDLKGGNKAAIQGNGDVGAGAATGHNGFFCSNCNMLS2RKDVLT 

AHPAAPGPVSRVYSRDRNQKCDDCKGKRHLDAHPGRAGTLWHIW 

AO^YFLLQILRRIGAVGQAVSRTAWSAIiWLAVVAPGKAASGVF 

WWI^IGWYQFVTLISWLNVFLLTRCLRNICKFLVLLIPLFLLLG 

LSLRGQG\NFFSFLPVLNWASMHRTQRVDDPQDVFKPTTSRLKQ 

PLCySDSEAFPWHWMSGV^QVASLSGQCHHHGENLRELTTLLQK 

LQARVDQMEGGAAGPSASVRDAVGQPPRETDFMAFHQEHEVRMS 

HLEDILGKLREKSEAI QKELEQTKQKTI S AVGEQLLPTVEHLQL 

ELDQLKSELSSMPJIVKTGCETVDAVQERVDVQVREMVKLLFSED 

QQGGSLEQLLQRFSSQFVSKGDIiQTMLRDLQLQIIiRNVTHHVSV 

TKQLPTS EAWSAVSEAGASGI TEAQARAI VNSALKLYSQDKTG 

MVDFALESGGGSILSTRCSETYETKTALMSLFGIPLWYFSQSPR 

WI Q PD I YPGNCWAFKGSQG YLWRLSMMIHPAAFTLEH I PKTL 

S PTGNIS SAP KDFAVYGIjKNE YQKEGQLLGQFTYDQDGBS LQMF 

QALKRPDDTAFQ I VE1»R I FSNWGHPE YTCLYRFRVHGEP VK 

TPNDREPPPQRPPSSRRASHLAQBITSAASLGPQTQILGSLTTA 








PVITSJVIRSMPGISSQILTNAQGQVIGTLPWVVKSASVA^PAn 
QS WVQAVTPQLLLNAQGQVI ATLASS PLP PP VAVRK\ PSTPES 
LLKSBVQPIKPTPTVPQPAWIASPAPAAKPSASAPXPITCSBT 
PTVSQLVSKPHTPSLDEDGINLEEIRBFAKNFKIRRLSLGLTQT 
QVGQALTATEGPAYSQSAICRFEKLDITPKSAQKLKPVLEKWLN 
EAELRNQEGQQNLMEFVGGE PS KKRKRRTS PTPQ AI EALNAYFE 

KNPLPTGQBITEIAKELNYDREWRVWFCNRRQTLKNTSKLNVF 
QIP 


6020 


4953 


549 


EAIQFBVS IGNYGNKFDTTCKPIiASTTQYSRAVFDGNYYYYLPW 
AHTKPWTLTSYWED ISHRLDAVNTLLAMAERLQTNIEALKSG I 
O^KIPANQIAELWLKLIDEVIEDTRYTIiPLTRGKANVrVLDTQI 
RKLRSRSLSQIHEAAVRMRSBATDVKSTIAEIEDWLDKLMQLTE 
EPQNSKPDIIIWMIRGEKRLAYARIPAHQVLYSTSGENASGKYC 
GKTQTI FLKYPQE KNNGPKVPVELRVNI WLGLSAVE KKFNS FAE 
GTFTVFAEMYENQALMFGKWGTSGLVGRHKFSDVTGKI KLKREF 
FLPPKGWEWEGEWIVDPERSLLTEADAGHTEFTDEVYQNESRYP 
GGDWKPAEDTYTDANGDKAAS PSBLTCPPGWEWEDDAWS YDINR 
AVDEKGWE YGI TI PPDHKPKS WVAAEKMYHTHRRRRLVR KRKKD 
LTQTASSTAOAMEEIiQDQEGWBYASLIGWKFHWKQRSSDTFRRR 
RWRR KMAP S ETHGAAA I FKLEGALG ADT TE DGDE KS LE KQ KHS A 

TTVFGANTPIVSOCFDRDYIYHLRCYVYQARNUALDKDSFSDP 
YAH I C FLHRSKTTE I XHS TLNPTWDQTI I FDE VE I YGBPQTVLQ 
NPPJCVIMELPDNDQVGKDEFLGRSIFSPVVKLNSEMDITPKLLW 
HPVMNGDKACGDVLVTAELILRGKDGSNLP ILPPQRAPNLYMVP 
QG I R P WQLTA I E ILAWGLRNMKNFQMAS I TSPS LWBCGGERV 
E3 WI KNLKKTPNFPSS VLFMKVFliPKEEL YMP PLVI KVIDHRQ 
FGRKPWGQCTIERLDRFRCDPYAGKEDIVPQLKASLLSAPPCR 
DI VI EMEDTKPLLASKCLSSMSTALSKMAS PATVHLTEKEEEIV 
DWWSKFYASSGEHEKCGQYIQKGYSKLKIYNCELENVABFEGLT 
DFSDTFKLYRGKSDENEDPSWGEFKGSPRIYPLPDDPSVPAPP 
RQFREL PDS VPQECTVRI Y IVRGLELQ PQDNNGLCDPY I KI TLG 
KKVIE \ DRDHY I PNTLNPVFGRMYELSCYLPQEKDLKIS VYDYD 
TFTRDEKVGBTIIDLENPF\LSRFG\SHCG\IPEEYCVSGVNTW 
RDSLR\PTQ\LLQNVARFKGFPQPILSEDGSRIRYGGRDYSLDE 
FEANKILHQHLGAPEERIiALHILRTQGLVPEHVETRTlflSTFQP 
NIS\RYYLRVIIWNTKDVILDEKSITGEEMSDIYVKGWIPGNEE 
NKQKTDVHYRS LDGEGNFNWRFVFPFDYLPAEQLCI VAKKEHFW 
SIDQTEFRIPPR\LIIQIW\DNDKFS\LDDYLGFPRTLTCRHTI 
HFLQKS PGGNC / RGLDMI PDLKAMNPLKAKTAS L FEQKSMKGWW 
PCYAE KDGAR VMAGKVEMTLEXlJffEKEADERPAGKGRDEPIWNP 
KLDLPNRPETSFLWFTNPCKTMKFI VVfRRFKWVI IGLLFLLI LL 
LFVAVLLY3LPNYLSMKIVKPNV


6021 


4953 


549 



EAIQFEVSlGNYGNKFDTTCKPLASTTQYSRAVFDGNYYYYIiPW 1 
AHTKP WTLTSYWEDI SHRLDAVNTLLAMAERLQTNIEALXSG I 
QGKI PANQLAELWLKL XDEVI EDTRYTLPLTEGKANVTVLDTQ I 
RKLRSRS LSQIHEAAVRMRSEATDVKSTLAE IEDWLDKLMQLTE 
EPQNSMPDI I IWMIRGEKRLAYARIPAHQVLYSTSGENASGKYC 
G KTQT I FLKYPQE KIJNGPKVPVE LRVNI WLGLSAVE KKFNS FAE 
GTFTVFAEMYEKQALMFGKWGTS GLVGRHKFS DVTGKIKLKREF 
FLPPKGWEWEGEWIVDPERSLLTEADAGHTEFrDEVYQNESRYP 
GGDWKPAEDTYTDANGDKAAS PS ELTCPPGWE'rfEDDAWS YD INR 
AVDEKGWEYGITI PPDHKPKS WVAAEKMYHTHRRRRLVE jrovien 

LTOTASSTAGAMEELQDQEGWEYASLIGWKFHWKQRSSDTFRRR 
RWRRKMAPSETHGAAAIFKLBGALGADTTEDGDEKSLEKQKHSA 
TTVFGANTP I VS CNFDRDYI YHLRCYVYQARNLLALDKDSFSDP 
YAHI CFLHRSKTTB I IHSTLNPTWDQTI IFDEVEI YGEPQTVLQ 
NPPKVIMELFDNDQVGKDEFLGRSIFSPVVKLNSEMDITPKLLW 
HPVMNGDKACGDVLVTAELILRGKDGSNLP ILPPQRAPNLYMVP 
2GIRPWQLTAIE ILAWGLRNMKNFQMAS ITSPSLWECGGERV 
BSWIKNLKKTPNFPSSVLFMKVFLPKEELYMPPLVIKVIDHRQ 
FGRKPWGQCTIERLDRFRCDPYAGKEDIVPQLKASLLSAPPCR 






WO 01/53312 



PCT/US00/34263 











" DIVIEMEDTKPLLASKCLSSMSTAIiSKMASPATVHLTEKEEElV 
DWWSKFYASSGEHEKCX3QYIQKGYSKLK1YNCELENVAE7EGLT 
DFSDTFKLYRGKSDENEDPSWGEFKGSFRIYPLPDDPSVPAPP 
RQPRELPDSVPQECTVRIYrVRGLELQPQDNNGLCDPYIKITLG 
XKVI E \DRDHYI PNTLNP VFGRMYELS CYLPQE KDIiKISVYD YD 
TFTRDEKVGETI IDIiENPF\l*SRFG\SHCG\ I PEEYCVSGVNTW 
RDSUR\PTQ\LLQNVARFKGFPQPILSEDGSRIRYGGRDYSIiDE 
FEANKILHQHIiGAPEERIiALH I IiRTQGLVPEHVETRTLHSTPQP 
NIS\RYYLRVHWKTKDVILDBKSITGEEMSDIYVKGWIPGNEB 
NKQKTDWYRSLDGEGNFNWRFVFPFDYLPAEQLCIVAKKEHFW 
SIDQTEFRIPPR\LIIQIW\DNDKFS\LDDYLGFPRTtiTCRHTI 
HFLQKS PGGNC/RGLDMIPDLKAMNPLKAKTAS LFEQKS MKGWW 
PCYAEKDGARVMAGKVEMTLEILMEKEADERPAGKGRDEPNMNP 
KLDLPNRPETS FIiWPTNPCKTMKFI VWRRFKWV1 IGLLFLLI LL 
LFVAVLLYSLPNYLSMKZVKPKV 


6022 


4953


549 


EAIQFEVSlGNYGNKFDTTCKPLASTTQYSRAVFbG^YYVYLPVi

AHTKPVVTLTSYWEDISHRLDAVNTLIAMAERLO/TNIEALKSGI 

O^KIPANQIJ^LWLKLIDEVIEDTRYTLPLTEGKANVTVLDTQI 

RKLRSRSLSQIHEAAVRMRSBATDVKSTLAEI EDWLDKLMQLTE 

EPQNSMPDI I IWMIRGEKRLAYARIPAHQVIiYSTSGENASGKYC 

GKTQTI FLKYPQE KNNGP KVP VELRVNI WLGLSAVEKKFNS FAE 

GTFTVFAEM YENQALMFGKWGTSGLVGRHKPSDVTGKI KLKREF 

FItPPKGWEWEGEWIVDPERSLLTEADAGHTEFTDEVYQNBSRYP 

GGDWKPAEDTYTDANGDKAASPSELTCP PGWEWEDDAWS YDINR 

AVDEKGWEYGI TI PPDHKPKS WVAAEKMYHTHRRRRLVRKRKKD 

LTQTASS TAGAMEE LQDQEG W EYAS L I G WK FHW KQ R S S DTFRRR 

RWRRKMAPSETHGAAAIFKLEGALGADTTEDGDEKSLEKQKHSA 

TTVFGANTPIVSCWFDRDYIYHLRCYVYQARNLLAtiDKDSFSDP 

YAHI CFLHRSKTTEI IKS TLNPTWDQT 1 1 FDEVEI YGEPQTVLQ 

NPPKVlMELFDNDQVGKDEFLGRSIFSPWKLNSEMDITPKIiLW 

HPVMNGDKACGDVLVTAEIiILRGKDGSNLPILPPQRAPNLYMVP 

QGIRP WQLTA1 E I LAWGLRNMKNFQMAS ITSPSLWECGGERV 

ES WI KITLKKTPNFPSSVLFMKVFLPKEELYMP PLVI KVIDHRQ 

FGRKPWGQCTIERLDRFRCDPYAGKEDIVPQLKASLLSAPPCR 

DIVIEMEDTKPLLASKCLSSMSTALSKMASPATVHLT3KEBEIV 

DWMSKFYASSGBHEKCGQYIQKGYSKLKIYNCELENVAEFEGLT 

DFSDTFKLYRGKSDENEDPSWGEFKGSFRIYPLPDDPSVPAPP 

RQFRBLPDSVPQECTVRIYIVRGLEIjQPQDNNGLCDPYIKITIiG 

KKVIE\DRDHYIPNTLNPVFGRMYELSCYLPQEKDLKISVYDYD
TFTRDEKVGETIIDLENPF\LSRFG\SHCG\IPEEYCVSGWTW

RDSLR \ PTQ\LLQNVARFKGFPQP ILSBDGSRIRYGGRDYSLDE 
FEANKILHQHLGAPEERLALHILRTQGLVPEHVETRTLHSTFQP 
N IS \R Y YLRVI I WNTKDVILDEKS ITGEEMSDI YVKG WI PGNEE 

NKQKTDVHYRSLDGEGNFNWRFVFPFDYLPAEQLCIVAKKEHFW
SIDQTEFRIPPR\LIIQIW\DNDKFS\LDDYLGFPRTLTCRHTI
HFLQKSPGGNC/RGLDMIPDLKAMNPLKAKTASIIFEQKSMKGWW
PCYAEKDGARWAGKVENFTLEILNEKEADERPAGKGRDEPNMNP
KLDLPNRPETSFLWFTNPCKTMKFIVWRRFKWVTIGLLFLLILL

LFVAVLLYSLPNYLSMKIVKPNV 


6023 


102 


916 


SQELGMFVELNNLLNrrPDRAEQGKLTLLCDAKTDGSFLVHHFL

S FYLKANCKVCFVALI QS FSHYS I VGQKLGVSLTMARBRGQLVF 

LEGL/IVCSGR\VFQAQKBPHPLQFLREANAGNIjKPLFEFVREA 

LKP VDSGEARWTYP VLLVDDLS VLLSLGMGAVAVLD P IHYCRAT 

VCWELKGNMVVLVHDSGDAEDEENDILLNGLSHQSHLILRAEGL 

ATGFCRDVHGQLRILWRRPSQPAVHRDQSFTYQYKIQDKSVSPP 

AKGMSPAVL 


6024 


3 


3260 


FLSFLCYPRFRCLFCLQFAIPASRMEQLNELELLMEKSFWEEAE 
L PAELFQ KKWAS FPRTVLSTGMDNRYLVLAVNTVQNKEGNCE K 
RLVITASQSLENKELC IIiRNDWCSVPVEPGDl IHLEGDCTSDTW 
IIDKDFGYLILYPDMLISGTSIASSIRCMRRAVLSETFRSSDPA 
TRQMIiIGTTVIiHEVFQKAINNSFAPEKLQEIiAFOTIQBIRHLKEM 


" 6025 


* 




YRLNI^QDEIKQEVEDYLPSFCKWAGDFMHKNTSTDFPQMQLS'iH 
PSDNSKDNSTCNIEWXPMDIEESIWSPRFGLKGK1DVTVGVKI 
HRGYKTKYKIMPLELKTGKESNSIEHRSQWLYTLLSQERRADP 
EAGLLLYLKTGQM YP VPANHLDKRELLKLRNQMAFS LPHRI S KS 
ATRQKTQLASLPQ I I EEEKTCKYCSQIGNCALYSRAVEQQMDCS 
S VP I VMLPKI BEETQHLKQTHLE YFS LWCLMLTLE SQS KDNKKN 
HQNIWLMPASEMEKSGSCIGNLIRMEHVKIVCDGQYLHNFOCKH 
GAIPVTNLMAGDRVIVSGEERSLFALSRGYVKEINMTTVTCLLD 
RNLSVLPESTLFRLDQEEKNCDIDTPIjGNLSKIiMENTFVSKKLR 
DLI I DFRE PQ F I S YLSS VLPHDAKDT VACI LKGLNKPQRQAMKK 
VLLS KDYTL I VGMPGTGKTTT I CTLVRILYACGFS VLLTS YTHS 
AVDNI LLKLAKFKIG PLRS R\Q IQKVHPAIQQFTEHE I CRS KS I 
KS\LALLEELYTSQL1DATTCMGINHPIFSRKIFDFCIVDEASQ 
ISQPI CLGPL FFSRR F VLVGDHQQIiPPLVLNREARALGMS ESLF 
KRLEQNKSAWQLTVQYRMNSKIMSLSNKLTYEGKLECGSDKVA 
NAVINLRHFKDVKLELEFYADYSDNPWLMGVFEPNNPVCFLNTD 
KVPAPEQVEKGGVSNVTEAKIiIVFLTSIFVICAGCSPSDIGIIAP 
YRQQLKI INDIiLARSIGMVBVNTVDKYQU\RDKS I VLVSFVRSN 
KDGTVGELLKDWRRT •NVAITRAKHKL1 LLGCVPSLNCYPPLEKL 
LNHLNSEKLIIDLPSREHESLCHILGDFQRE 1 


6026


3977 


89 


GGFPAQSDHIjPPVFPLRSDLLITMSTLYVSPHPDAFPSLRALIA
ARYGEAGEGPGWGGAHPRICLQPPPTSRTSFPPPRLPALEQGPG 
GLWVWGATAVAQLLWPAGLGGPGGSRAAVLVQQWVSYADTEWP 
AACGATLP ALGLRSSAQDPQAVLGALGRAIiSPLEE WLRLHTYLA 
GEAPTLADLAAVTALLLPFRYVLDPPARRIWNim^WFVTCVRQ 
PEFRAV^GEWLYSGARPLSHQPGPEAPALPKTAAQLKKEAKKR 
EKLEKFQQKQKI QQQQP PPGEKKPKPEXREKRDPGVI TYDL PTP 
PGEKRDVSGPM PDS YS PRYVEAAWYPW WEQQGFFKPE YGRPNVS 
AANPRGVF^CIPPPNVTGSLHLGHALTNAIQDSLTRWHRMRGE 
TTLWNPGCDHAG1ATQVWEKKLWREQGLSRHQLGREAFLQEVW 
KWKEEKGDRIYHQLKKLGSSLDWDRACFTMDPKLSAAVTEAFVR 
LHEEG 1 1 YRSTRLVNWS CTLNS A I SD I E VDKKELTGRTUjS V?G 
YKEKVEFGVLVS FAYKVQGSDSDEE VWATTR I ETMLGDVAVAV 
HPKDTRYQHLKGKNVIHPFLSRSLPIVFDEFVDMDFGK3AVKIT 
PAHDGNDYEVGQRHGLEAISIMDSRGAI.INVPPPFLGLPRFEAR 
KAVLVALKERGLFRGIEDNPMWPLCNRSKDWEPLLRPQWYVR 
CGEMAQAAS AAVTRGDLRI IiPERHQRTWHAWMDN I RE \ WCMFPG 
KLWWG \ HR \ I PAYFVrVSDPAVPPGEDPDQRYWVSGRNEAEARE 
KAAKEFGVS PDKISLQQDBDVLDTWFSSGliFPLSILGWPNQSED 
LSVFYPGTLLETGHDILFFWVARMVMLGLKLTGRLPFREVYLH^ 
I VRDAHGRKMSKS LGNVIDPLDVI YG2 SLQGLHNQLLNS NIDPS 
EVEKAKEGQ KADFPAGI PECGTDALRFG LCAYMS QGRDI NLDVN 
R ILGYRHFCNKLWNATKFALRGLGKGF VPS PTSQPGGHESLVDR 
WIRSRLTEAVRLSNQGFQAYDFPAVTTAQYSFWLYELCDVYLEC 
LKPVLNGVDQVAAECARQTLYTCLDVGIiRLLSPFMPFVTEELFQ 
RLPRRMPQAPPSLCVTPYPEPSECSWKDPEAEAALELALSITRA 
VRP \LRADYNLHPESGPTCFIiEVAD\EATGALASAVSG YVQG PG 
QAQVWAVAEPWGLPAP\QGCAVAtASDRCSI\HLQLQG\LLDP 

ARELG\KLQ\AKRVEAQ\RQAQ\RLR\ERRA\ASGNPVKVPL\E

VQEADEAKLQQTEAELRKVDEAIALFQKML 1 




2674 


514 


GPITFLKXKAiUvjKDMPLRIHVLLGIAITTLVQAVDKKVDCPRLCj 
itniRfWE x jf k» jl i riiZJUa TVDCK DLiGLJjT r P ARLPANTQ I f^T^Q 1 

TWNIAKIEYSTDFPVNLlXSLDLSQNNLSSVTMINGKKMPQLIiSV 
YLEENKLTELPEKCLSELSNLQELYINHNLLSTISPGAFIGLHN 
LLRLHLNSNRLQMINSKWFDALPNLEILMIGENPIIRIKDMNFK 
PLINLRSLVIAG 1NLTE I PDNALVGLENLES IS FYDNRLI KVPH 
VALQ KWNLKFljDtjNKNP INR I RRGDPSNM LHLKELG I NNM PEL 
ISIDSLAVDNLPDLRKIEATNNPRLSYIHPNAFFRLPKLESLML 
^SNAIiSALYHGTlESLPNLKE I S IHSNP I RCDCVIRWMNMNKTN 
IRFMEPDSLFCVDPPE FQGQNVRQVHFRDMME I CLPL IAPESFP 
3NLNVEAGSYVSFHCRATA\EPQPEI YWITPSGQKLLPNT\l*TD 






KFYVHSEGTLDINGVTPKEGGLYTCIATNLVGADLKSVMIKVDG 
SPPQDNNGSLNIKIRDIOANSVIiVSWKASSKILKSSVKWTAFVK 
TBNSHAAQSARIPSDVKVYNLTHLNPSTEYKICIDIPTIYQKNR 
XKCVNVTTKGliHPDQKEYEKNN'rTTLMAC LGGLLGI IGVI CL I S 
CLSPBMNCDGGHS Y VRN YLQ KPTFALGE L YFPLINL WEAGKE KS 
TSLKVKATVIGLPTNMS 


6027
6028


5254 


4148


GGRRAPGRPGRS I KDEEEBTVFREWS FSPDPLPVRYYDKDTTK 
P I SFYLS S IiSELIiAWKPRLEDGFNVALBPLACRQPPLS SQRPRT 
LLCHEMMGGYLDDRFIQGSWQTPYAFYHWQCIDVFVYFSHHTV 
TI PPVG WTNTAHRHGVCVLGTFITEWNEGGRLCHAFIAGDBRS Y 

qavadrlvqitXrffrfdgwlimiensi^slaavgnmppflrylt 

TQLHRQVPGGLVLWYDSWQSGQLKWQDELNQHNRVFFDSCDGF 

FTNYNWREEHLERMLGQAGERRADVYVGVDVFARGNWGGRFDT

DKVGGGFRPRASGPVPPLGPHFLMDLPFPSAPQRNDSSCSSQSG 
DPVALRNRCPAPAKLCPH 




120 


3432 


NCLIiLQAKGFHGEI EDLQQWLTDTERHLLAS KPLGGLPETAKEQ 
LNVHMEVCAAFEAKEETYKSLMQKGCMMtiARCPKSAETNXDQDI 
W^KEKWESVETKLNER\KT\KLEEALNLA\MEFHNSL\QDFIN 
t/LTQAEQTLNVASRPSLILDTVLFQIDEHKVFANEVNSHREQI I 
ELDKTGTHLKYFSQKQDWLIKNLLISVQSRMEKWQIiLVERGR 
SLDDARKRAKQFHEAWSKLMEWLBESEKSIjDSEIiEIANDPDKIK 
TQLAQHKE FQKS IX3AKHS VYDTTNRTGRS L KEKTSLADDNLKLD 
DMLS2LRDKWDTICGKSVERQNKI*EEA\LLFSGQFTDALQALID 
WLYRVEPQIAEDQPVHGPIDLVMNL I DNKKAFQKELGKRTSSVQ 
ALKRS ARELI EGSRDDS S WVKVQMQELSTRWETVCALS I SKQTR 
LEAALRQAEEFHSVVHALLEWLAEAEC3TLRFHGVLPDDEDALRT 
LI DQHKE FMKKLEEKRAELNKATTMGDT VLAI CHPDS I TTI KHW 
I TI I RARFEE VIAWAKQHQQRLASALAGLIAKQEI^EALIjAWLQ 
WAETTLTDKDKEVI PQE I EEVKALI AEHQTFMEEMTRKQPDVDK 
VTKTYKRRAADPSSLQSHI PVLDKGRAGRKRFPASSLYPSGSQT 
QIETKNPRWLLVSKWQQVWLLALERRRKLNDALDRLEELREFA 
NFDFDIWRKKYMRWMNKKKSRVMDFFRRIDKDQDGKITRQEFID 
GIIjSSKFPTSRLEMSAVADIFDRDGDGYIDYYEFVAALHPNKDA 
YKP1 TDADKI EDEVTRQVAKCKC AKRFQVEQ IGDNKYRFFLGNQ 

FGDSQQLRLVRILRSTVMVRVGGGWMALDEFLVKNDPCRAKGRT

NMELREKFILADGASQGMAAFRPRGRRSRPSSRGASPJNRSTSVS 

sqaaqaaspqvpatttpkilhpltrxjygkpwltnskmstpckaa 

ECSDFPVPSAEGTPIQGSKLRLPGYLSGKGFHSGEDSGLITTAA 
ARVRTQFADSKKTPS RPGS RAGS KAGSRASSRRGSDASDFD I S E 
IQS VCSD VETVPQTHRPTPRAGSRP S TAKPS KI PTPQRKS PAS K 
LDKSSKR 


6029


1 


3533 



IMPCGSSRLLRGCWTHPNEPV^ULS y FDCIES VMENS KVLGE5 M 
AGISQNAXTGDLPAFGECVG I ASKALCGLTEAAAQAAYLVGI FD 
PNSQAGHQGLVDP IQFARANQAI QMACQNLVDPGS S PSQVLS AA 
T I VAKHTS ALCNACR I AS S KTANPVAKRHF VQS AKE VANSTANL 
VKT I KALDGDFSEDNRNKCR I ATAPLI EAVENLTAFASNPEF VS 
I PAQ1 SSEGSQAQEPILVSAKPMLES SS YLIRTARSLAINPKDP 
PTWSVLAGHSHTVSDSIKSLITSIRDFCAPGQRECDYSIDGINRC 
IRDIEQASLAAVSQSLATRDDISVEALQEQLTSWQEIGHLIDP 
I ATAARGEAAQLGHKGTQLAS YFEPL I LAAVG VAS KI LDHQQQM 
T VLDQTKTLAESALQMLYAAKEGGGN P KAQHTHDAI TEAAQLMK 
E AVDD I M VTLMEAAS EVGL VGGMVDA I AEAMS KLDE GTP PE PKG 
xrvLrxuiivv KAI AVTAQEMMTKS VTNPEELGGLASQMTSD 
YGHLAFQGQMAAATAEPEEIGFQIRTRVQDLGHGCI FLVQKAG\ 
^QVCPTOS YTKRBLI ECARAVTEKVSLVLSALQAGNKG!TQACI 
rAATAVSGI IADLDTTI MFATAGTLWABNSETFADHRENILKTA 
^VEDTKLLVSGAASTPDKLAQAAOSSAATITQIAEVVKLG^ 
SLGSDDPETQWLIKAI KDVAKALSDLI S ATKGAAS KPVDDPSM 
if QLKGAAKVMVTNVTSLLKT VKAVEDEATRGTRALEATIEC I KQ 
SLTVFQSKDVPEKTSS PERSIRMTKGITMATAKAVAAGNSCRQE 
^VIATANLSRKAVSDMLTACKQASFHPDVSDEVRTRALRFGTEC 








TLGYLDLLEHVLVILQKPTPELKQQLAAFSKRVAGAVTELIQAX
EAMKGTEWVDPEDPTVXAETELLGAAASIEAAAKKLEQLKPRAK
PKQADETLDFEBQILEAAKSIAAATSALVKSASAAQRELVAQGK
VGSIPANAADDGQWSQGLISAARMVAAATSSLCEAANASVQGHA
SEEKLISSAKQVAASTAQLLVACKVKADQDSEAMRRLQAAGNAV
KRASDNLVRAAQKAAFGKADDDDWVKTKFVGGIAQIIAAQEEM
LKK3RELEEARKKLAQIRQQQYKFLPTELREDEG


6030 


3 


1777 


FPGRGSPALQLE VL I CLGLMGLERALNVLAP I FYRN I VNLLTBN 
APWNSIAWTVTSYVFUCFLQGGGTGSTGFVSNIiRTFLWIRVQQF 
TS RRVELLX FSHLH ELS LR WHLGRRTGE VLR X ADRGTSS VTGLL 
S YLVFNVI PTLADI 1 1 G 1 1 YFSMF FNAWFGLI VFLCMSLYLTLT 
IWTEWRTKFRRAMNTQENATRARAVDSLLNFETVKYYNAESYB 
VKRYREAIIKYQGLEWKSSASLVLLNQTQNLVIGLGLLAGSLLC 
AY FVTEQKLQVGD YVL FGT YI IQLYMPLNWFGTYYRMIQTNFID 
MENMFDLLKK\ETEVKDLPGAGP FRFQKGRIEFENVHFS YADGR 
BTLQD VSFTVMPGQTLALVGPSGAGKST I LRLLFRFYD I S S GCI 
RIDGQDISQVTQALFRFSHWELCPKDTVLFNDTIADNIRYGRVT 
AGND E VEAAAQAAG IHDAX MAP PBGYRTQVGERGLKLSGGEKQR 
VAIARTILKAPGI ILLDEATSALDTSNERAIQASLAKVCANRTT 
I WAHRLSTWNADQILVI KDGCI VERGRHEALLSRGGVYADMW 
QLQQGQEETSEDTKPQTMER 


6031 


160 


1694 


LRMSENLDKSNVNEAGKSKSNDSEEGLEDAVEGADEALQKAIKS 
DSSSPQRVQRPHSSPPRFVTVEELLETARGVTNMALAHEIVVNG 
DFQ I KP VELP ENSLKKRVKE I VHKAFWDCLS VQLS EDP PAYDHA 
IKLVGEIKBTLLSFLLPGHTRLRNQITBVLDLDIjI KQEAENGAIi 
DI S KLAEFI IGMMGTLCAPARDEE VKKLKD I KE X VPLFRE I KS V 
LDLMKVDMANFAISSIRPHLMQQS VEYERKKFQEI LERQPNS ld 
FVTQWLEEASEDLMTQKYKHAIiPVGGMAAGSGDMPRLSPVAVQN 
YAYLKLLKWDHLQRPFPETVLMDQSRFHELQLQ\REQLTILGAV 
LLVTFSMAAPGISSQADFAEKLKMIVKILLTDMHLPSFHLKDVL 
TTIGEKVCLEVSS CLSLCGS S PFTTDKETVLKGQI QAVAS PDDP 
IRRIMESRILTFLETYLASGHQKPI/PTVPGGLSPVQRELBEVAI 
KFARLVNYNKMVFCP YYDAI LS KILVRS 


6032 


39 


2415 


AARLCRAQPTKSAWMIRDLSKMYPQTRHPAPHQPAQPFKFTiSE 
5 CDR I KEEFQFLQAQYHS L KLECEKLASE KTEMQRHYVMY YEMS 
YGLN I EMHKQAE I VKRLNAI CAQVI P FLS QEHQQQ WQAVERAK 
QVTMAELNAI XGQQQLQAQHLS HGHGLPVPLTPHP S GLQ P PAI P 
PIGS S AGLLALSS ALGGQSHLPI KDE KKHHDNDHQRDRDS I KSS 
SVSPSASFRGAEKHRNSADYSSESKKQKTEEKEIAARYDSDGEK 
SDDNL WD VSNEDPS S PRGS PAHS P R ENGLDKTRLLKKDA P I S P 
AS IASSS STPS 5KS KELSLNE KSTTPVS KSNTPTPRTDAPTPGS 
NSTPGLRPVPGKPPGVDPLASSIiRTPMAVPCPYPTPFGIVPHAG 
MNGELTSPGAAYAGLHNISPQMSAAAAAAAAAAAYGRSPWGFD 
PHHHMRVPAI P PNLTG 1 PGGKPAYSFHVS ADGQMQP VP FPPDAL 
IGPG I PRHARQINTLNHGEWCAVTISNPTRHVYTGGKGCVKVW 
DISHPGNKSPVSQU)CLNRDNYIRSCRLLPDGRTLIVGGEASTIi 
S IWDLAAPTPR I KAE LTSSAPACYALA I S PDSKVCFS CCSDGNI 
AVWDLHNQTLVRQFO^HTDGASCIDISNDGTKLWTGGLDNTVRS 
W\DLREGRQLQQHD/FFTSPVFSLGYCP\TEEWLAVGMENSN\V 
EVLHVTKPDKYQI^LHBSC\^LKFAHC»KWF\VSTGKDNIJLNA 
W\RTPYG\ASIF\QSKBSSS\VLSCDI\SVDDKYIVTGS\GDK\ 
RATVYBVIY 


6033 


39 


2415


AARLCRAQPTKSAWMIRDIiSKMYPQTRHPAPHQPAQPFKFTlSE
SCDRI KEEFQFLQAQYHS LKLECEKLASEKTEMQRHYVMYYENS 
YG LNI EMHKQAE I VKRLNAI CAQ VI P FLSQEHQQQ WQAVERAK 
QVTMAELNAI IGQQQLQAQHIiSHGHGLPVPLTPHPSGLQPPAI P 
P I GS S AGLUUj SSALGGQSHLP I KDEKKHHDNDHQRDRDS IKS S 
S VS PSAS FRGAEKHRNSADYS S E S KKQKTEEKE IAAR YDS DGE K 
SDDNLWDVSNEDPSSPRGSPAHSPRENGLDKTRLLKKDAPISP 
AS I A5SESTPSSKSKELSLNEKSTTPVS KSNTPTPRTDAPTPGS 
NSTPGLRPVPGKPPGVDPLASSLRTPMAVPCPYPTPFGIVPHAG 








MNGELTSPGAAYAGLHNISPQMSAAAAAAAAAAAYGRSPWGFD 
PHHIIMRVPAIPPNLTGIPGGKPAYSFHVSAD3QMQPVPFPPDAL 
IGPGIPRHARQINTLNHGEWCAVTISNPTRHVYTGGKGCVKVW 
DI SH PGNKS PVS QLDCDNRDN Y IRSCRLLPDGRTL I VGGE ASTL 
SIWDLAAPrPRIKAELTSSAPACYALAXSPDSKVCPSCCSDGNI 
AVOT3LHNQTLVRQFQGHTDGASCIDIS1TOGTKLWTGGLDNTVRS 
W \ DLREGRQLQQHD / PFTS PVFSLGYCP \ TEEWLAVGMENSN\ V 
EVLHVTKPDKYQLHLHESCVLSLKFAHCGKWF\VSTGKDNLLNA 

W\RTPYG\ASIF\QSK3SSS\VLSCDI\SVDDKYIVTGS\GDK\ 
RATVYBVIY 


6034 


2663 


714 


ESGRRRRLKRRRSPCPGTAGGPGETNPGPGACPRGPREEAAAAM 
E I APQEAPP VPGADGDI E EAPAEAGSPS PASPPADGRLKAAAKR 
VTFPSDEDI VSGAVE P KD PWRHAQNVTVDE VIGAYKQACQ KLNC 
RQIPKLLRQLQEFTDLGHRLDCLDLKGEKLDYKTCEALEEVFKR 
LQFKVVDLEQ TNLDEDGASALFDM I E Y YESATHLN I SFNKH I GT 
RGWrQAAAHMMRKTSCLQYL\DARNTPLLDHSAPFVARAL,RIRSS 
LA VLHLEN AS LSGR P LMLLATALXMNTCNIiR E Li YL \ ADNKLNG LQ 
DSAQLGNLLKFNCSLQILDLRNWHVLDSGLAYICEGLKEQRKGL 
VTL\VLWNNQLTHTGMAFLGOTLPHTQSLETLNLGHNPIGNEGV 
RHLKNGL I SNRS VLRLGLASTKLTCEGAVAVAEF I AES PRLLRL 
DLRENEI ICI^LMALSLAL KVNHSLLRXDLDREPKKEA^ E 
TQKALLAE IQNGCKRNLVLARBREEKEQPPQLSASMPETTATEP 
QPDDEPAAGVQNGAPSPAPSPDSDSDSDSDGEEEEBBEGERDET 
PSGAI DTRDTGSSEPQPPPBPPRSGPPLPNGLKPEFALALPPE P 
PPGPEVKGGSCGUSHELSCSKNEKELEELIiLEASOESGOETL 


6035 


19 


404 


SVTYLGIILHKNTGJ\LPADPVQLISQTPTPSTKQQIjLSFLGMVG 
YFYLWIPGFAILTKPLCKI»TKENI*ADAIDPKSFSHSSFRSLKTA 
LEMASTLALPDSSQPF\SLHTABVQGCWEILTQGLGPLPV 


6036 


1745 


3*6 - 


L PDVEFOLGRRRCaRKMDS VEKGAATS VS NPRGRP^RGRPPKIjGJ^ 
SRGGO^RGVEKPPHUUUiIIJyiGGSKGIPIiKNIKHIiAGVPLlGW 
VLRAALDS GA FQ S VWVSTDHDE I ENVAKQ FGAQVHRRS SEVSKD 
S STSLDAI IEFLNYHNEVD I VGNI QATS PCLHPTDLQKVAEMI R 
EEGYDS VFSWRRHQFRWSEI QKGVREVTEPLNLNPAKRPRRQD 
WDGELYENGS FYFAKRHL I EMGYLQGGKMAY YEMRAEHS VD IDV 
D I D WP I AEQRVLRYGYFGKEKLKE I KLLVCNIDGCLTNGH1 YVS 
GDQKEIISYDVKDAIGISLLKKSGIBVRLISERACSKQTLSSLK 

LDCKMEVSVSDKLAWDEWRKEMGLCWKEVAYLGNEVSDEECLK
RVGLSGAPADACSTAQKAVGYICKCNGGRGA\IREFAEHIC\LL
MEKGLINFMPKNRNLAVNIGEKK


6037 


2936 


1919 


WTSWWMSSVLTILLFSLQGNKMLNYSAPSAGGYLLPRKPVGTPA

GGGFPRRHSVTLPSSKFRQNQLLSSLKGEPAPALSSRDSRFRDR 
SFSEGGERLLPTQKQPGGGQVNSSRYKT\ELCRPFEENGACKYG 
DKCQFAHGIHELRSLTRHPKYKTELCRTFHTIGFCPYGPRCHFI 
HNAEE RRALAGARDLSADRPRLQHS FS FAGFPS AAATAAATGLL 
DSPTSITPPPILSADDLLGSPTLPDGTNNPF\AFSSQEIaASLFA 
PSMOLPGGGSPTTFLFRPMSESPHMFDSPPSPQDSLSDQEGYLS 
SSSSSHSGSDSPTLDNSRRLPIFSRLSISDD 


6038 


1450 


426 


SSALQEFGTRNHTFGVPLPHRRKQI I SCNICQLRFNSDSOAAAH 
YKGTKHAKKLKALEAMKNKQKSVTAKDSAKTTFTS ITTNT INTS 
SDKTDGTAGTPAISTTTTVBIRKSSVMTTEITSKVBKSPTTATG 

TMtiEARNGSGT I KAFPRAG VKG KGPVNKGNTGLQNKT FHCB ICD 
VHVNSETQLKQHISSRRHKDRAAGKPPKPKYSPYWKLQKTAHPI, 
GVKIjWSKEPSKPIAPRILPNPLAAAAAAAAVAVSSPFSLRTAP 
AATLFQTS ALPPAUbR PAPGP IRTAHT P VLFAP Y 


6039 


4073 


1000 


LDEYEARLTLANLDDFEEDNEDDDENRVNQEBKAAKITELINKL 
NFLDEAEKDLATVNSNPFDDPDAAELNPFGDPDSBBPITETASP 
RKTEDS FYNNS YNPFKB VQTPQYIiNPFDEPEAFVT I KDS PPQST 
KKKNIRPVDMSKYLYADSSXTEBBELDESNPFYEPKSTPPPNNL 
VNPVQELETERRVKRKAPAPPVliSPKTGVLNENTVSAGKDLSTS 



6040 



475 



1052 



3886 



Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



6042 



253 



.„^ vlllJ . «wy^iMJVTi^iuiv rArjiKUVKiTNFTTSW 

RNGIiSFCAILHHFRPDLIDYKSLNPQDIKENNKKAYDGFASIGI 

SSKSTYKVGNYETDTNSSVDQBKPYAELSDLKREPELQQPISGA 
VDFLS QDDS VPVNDSG VGESES EHQTPDDHLS PSTAS P YCRRTK 

SDTEPQKSQQSSGRTSGSDDPGICSNTDSTQAQVLLGKKRLLKA 

LENSRSLE CRSDPESP I KKTSLS PTSKI/3YSYSRDLDLAXKKHA 
ShKQT ESD PDADRTTLNHADHS SKI VQHR LLSRQEELKERAR VL 
LEQARRDAALKAGNKHWTNTAAPFCNRQLSDQQDEERRRQLRER 
ARQIj I AEARSGGKMS ELPS YGERAAEKLKBRS KASGDENDNIE I 
DTNE E I PEG FWGGGDE LTNLENDL DTPEQNS KL VDLKL KKLLE 
VQ PQVANS P S S AAQKAVTES SEQDMKSGTEDLRTERLQ KTTERF 
RKP WFSKDST VRKTQLQS PSQ Y I ENRPEMKRQRS 1 QEDTKKGN 
EE KAA I TETQRKPSEDEVLNKGFKDS \ SQYWGELAAIjENEQKQ 
IDTRAALVEKRLRYLMDTGRNTEEEEAMKQEWFMLVNKKNALIR 
R^QLSIiLEKEHDLERRYEIjIjNRELRAMLAI edwqkteaqkrre 
QLLLDEZiVALVNKRDALVRDI *T>AQEKQAEEEDEHLRRTLEQNKG 

kmakkee kc vlq 

^•^l TA PSCA|.'PVQFRQPSVS GL3QITKSLYISNGVAA^KLM 
LSSNQITMVINVSVEWNTLYEDIQYMQVPVADSPNSRLCDFFD 
PIADHIHSVEMKWR\TLLHCMGVSRSAALCLAYLMKYHAMSL 
LDAHTWTKSCRPIIRPNSGFWEQLIHYEFQLFGKNTVHMVSSPV 
GMIPDIYEKEV RLMIPL 

TEKUEKTAHNL EiM V L1H # WERLSE I C V AK1 iaKPKAU VKS VLGVS 
NLIC VLQKPKGSLKSSKKKNGKVRPADE ILESNKEKEKCVS SEG 
EKIECWELTTEPSLTHNSSGLLSPLRKKPLEDLVCKLADISINY 
VNER KS EQHLRPLS TLEiDSPS S SR VFKMItLGDEKQS I VQAKPLE 
I AKL VQKNPAVQFIjYQKL IGWLNEDQRKDFGFLVD I LYSALRCC 
DNDMERKKVLDDLTKVDLKWNSLLKIIEKACPSSDKKALVTPWL 
KGDILGEKLVNLADCLCNEDLESRVSSESHFSERWTU5LVLSQ 
HVKND YL1GDVYVERI I VRLHETkPKTKKIjSEAESSDS S VS FI C 
DVAYNYFSSAKGCLLMPSSEDLLLTLFQLCAQSKEKTHLPDFLI 
CKLKNTWLSGVNLLVHQTDSSYKESTFLHLSALWLKNQVQASSL 
DI NSLQ VLLS AVDDL LNTLL E S EDS YLMG VYIGS VM PNDS E WE K 

MRQSLPMQWLHRPLLEGRLSLNYECFKTDFKEQDIKTLPSHLCT 
SALLSKMVLIALRKETVLENNELEKIIAELLYSLQWCEELDNPP 
IFLIGFCEILQKKNITYDNLRVLGNMSGLLQLLFNRSREHGTLW 

S LI IAKIi I LSRS IS sdevkphykrkesffpltegnlhtiqs lc p 

FLS KEEKKEFS AQC I PALLGWTKKDLCS TNGGFGHLAI FNSCLQ 
TKSIDDGELLHGILKIIISWKKEHED1FLFSCNLSEASPEVLGV 
NIEIIRFLSLFLKYCSSPLAESEWOFIMCSMIAWLETTSENQAL 
YSIPLVQLFACVSCDLACDLSAFFDSTTLDTIGNIiPVNtjISEWK 
EFFS QGIHS LLLP ILVT VTGENKDVSETS FQNAMLKPMCETLTY 
I SKEQLLSHKLPARIiVADQKTNLPEYLQTLLNTLAPLLLFRARP 
VQIAVYHML YKLMPELPQ YDQDNLKS YGDEEEE PALS P PAALMS 
IiLSIQEDLLENVLGCIPVGOIVTIKPLSEDFCYVLGYLLTWKLI 
LTFFKAASSQLRALYSMYLRKTKSLNKLLYHLFRLMPBNPTYAE 
TAVE VPNKDP KTFFTEELQLS IRETTMLP YHIPHLACS VYHMTL 

KDLPAMVRLWMNSSEKRVFNIVDRFTSKYVSSVLSFQE1SSVQT 
STQLFNGMTVKARATTREVMATYTIEDIVIBLIIQLPSNYPLGS 
I X VESG KRVG VAVQQWRNWMLQLSTYIjTHQNGS I MEGLAL WKNN 

VDKRFEGVEDCMICFSVIHGFNYSLPKKACRTCKKKFHSAVCLY 
KWFTSSNKSTCSLCRETFF X 
MAELAPASPSu I KAS VSNGDTTLLCSRRQSCGMNJbl ^kO^SLTYP 
GSPAPSHSLPLQPRSGGSLCPSRAW/PDPHQLFDDTSSAQSRGY 
GAQRAPGGLSYPAASPTPHAAFLADPVSNMAMAYGSSLAAQGKE 
LVDKNIDRFIP I TKLKY YFAVDTMYVGRXLGLLFFP YLHQDWEV 
QYQQDTP VAPRFD VKAPDLYI PAMAFI TYVLVAGLALGTQDRFS 
PDLLGI^ASSAiiAMLTLEVIAILLSLYLVTVNTDIiTTIDLVAFL 
GYKYVGMIGGVLMGLLFGKIGYYLVLGWCCVAI FVFMIRTLRLK 


6043 


403 


599 


ILADAAAEGVP VRGARNQLRM YLTMAVAAAQ PMLM YWLTFHLVR 

LCLFFFFPCATPVLPLPSHSAt,/CLSril.SVSSWFCP(^QPEiL"pT~ 

PLPPLQNKTAKGSLSTEQSBRG 


6044 


793 


4li 


KLEM^FTLISKVKISREVTMIASKFGIGKJQVRHSLLGYLGVVV 
DIDP VYS LS E PS PI)ELAV1C>ELRAAPW YWVMEDDNGLPVHTYL 
AEAQIiSSELQDBHP\EQPSMDELAQTIRKQLQAPRLRN 


6045


155 


2299 


SPLPQVAAMNYLRRRLSDSNFMANLPNGYMTDLQRPQPPPPPPG 
AHSPGATPGPGTATAERSSGVAPAASPAAPSPGSSGGGGFFSSL 
SNAVKQTTAAAAATFSEQVGGGSGGAGRGGAASRVLLVIDEPHT 
DWAKYFKGKX I HGB IDI KVEQAEFSDLNLVAHANGGFS VDME VL 
RNGVKWRSLKPDFVIiIRQHAFSMARNGDYRSLVIGLQYAGlPS 
VNSLHSVYNFCDKPWVFAQMVRLHKKLGTEEFPIilDQTFYPNHK 
BMLSS \TTYPVWKMGHGTLWGWGKVKVDNQHDFQD I AS WALT 

KT YATAE PFT DAKYDVP VOTf T ftfWVira vmd «pet r cr*KHaxr*mimrt n 7i 

MLEQIAMSDRYKLWVDTCSEIFGGLDICAVEALHGKDGRDHIIE 
WGSSMPLlGDHQDEDKQLIVELWNKMAQALPRQRQRnASPGR 
GSHGQTPSPGALPLGRQTSQQPAGPPAQQRPPPQGGPPQPGPGP 
QRO^PPI^QRPPPQGQQHLSGLGPPAGSPLPQRLPSPTSAPQQP 
ASQAAPPTQGQGRQSR PVAGGPGAP PAARPPAS PS PQRQAGPPQ 
ATRQTSVSGPAPPKASGAPPGGQQRQGPPQKPPGPAGPTRQASQ 

AG PVP^TG PPTTYWPWD QfSXVSD&r'DDtrertr.li nv nc/%rarn n nikfr* 
nwrTr^iurr i ivw**^irS>\j*^jlri*wKir RJfc'UlaAQKPSQDVP P PATA 

AAGGPPHPQLNKSQSLTNAFNLPEPAPPRPSLSQDEVKAETIRS 
LRKS FASLFSD 


6046 


212 


1075 


egltgpcervpfllgrgpphgatraghrraVrwagpeslpplpr 
sl i mds pragthqgp ldaetevgadrctstayqeqrpqveqvgk 

QAPLSPGLPAMGGPGPGPCEDPAGAGGAGAGGSEPLVTVTVQCA 
PTVALRARRGADLSSLRALLGQALPHQ\AQLGQLSYLAPGEDGH 
WVPI PE3ESLQRAWQDAAACPRGLQLQCRGAGGRPVLYQWAQH 
SYSAO^PEDLGFRQGDTVDVLCBVDQAWLEGHCDGRIGI fpkcf 
WPAGPRMSGAPGRLPRSQQGDQP 


6047 


49 


1405 


PVLVTSLRMREAD^tifepPQtjMEVSADI ISTVEFNHTdELLATGD " 
KGGRWIFOREPESKNAPHSOnPVnWQTPrtcuiPDwiTnvT vct e» 

IEBKINKIKWLPQQNAAHS LLSTNDKTIKLWKI TERDKRPEGYN 
LKDEEG KLKDLS TVTS LQ V P VLK P MDLMVEVS PRRI FANGHTYH 
INS I S VNS DCETYMSADDLRINLWHLAITDRS FTP \NI VDIKPA 
NMEDLTE VI TASE FHPHHCNLFVYSSS KGSLRLCDMRAAALCDK 
HSKLFEEPEDPSNRSFFSEIIS\SVSDVKFSHSDRYMLTR\DYL 
TVKVWDL\NMEARPIETYQVHT)xIjRSKLCSLYENDCI FDKFECA 
WNGSDS V I MTGA\ YNNFFRMFDRNTKRDVTL\EASRESS KPRAV 
LKPRRVCVGGKRRRDDI S VDSLDFTKKILHTAWHPAENI 1 AIAA 
TNNLY I FQDKVNS DMH 


6048


1 


3194 


GI RTPKFCDS PTS DLEMRNGRGRGKRMR PNSNTPVNETATASDS 
KGTSNSS KTRAGANS KGRRGSQNSS EHRPPASSTS EDVKAS PSS 
ANKR KN K P LSDME LNS SS ED S KGS KRVRTNSMG S ATG PL PGTKV 
E PT VLDRNCPS P VLI DCPH PNCNKKYKH I NGLKYHQAHAHTDDD 
S KPEADGDS EYGEEP I LHADLGSCNG \ AS VSQK \GS LS PARS AT 
P KVRLVE PHSPS PSS KFS TKGLCKKKLSGEGDTDLGALSNDGSD 
DGPSVNDETSNDAFDSLERKCMEKEKCKKPSSLKPEKIPSKSLK 
SARPI/APLAIPPQQIYTFQTATFTAASPGSSSGLTATVAQAMP 
NS PQLKP IQ PKPTVMGEPFTVNPALT P AKDKKKKDKKKKES S KE 
LESPLTPGKVCRAEEGKSPFRESSGNGMKMEGLLNGSSDPHQSR 
LASIKAEADKI YSFTDNAPSPS IGGSSRLENTTPTQPLTPLHW 
TQNGAEASSVK^SPAYSDISDAGEDGEGKVDSVKSKDAEQLVK 
EGAKKTLFPPQPQSKDSPYYQGFESYYSPSYAQSSPGALNPSSQ 
AGVESQALKTKRDEEPESIEGKVKNDICEEKKPELSSSSQQPSV 
IQQRPKMYMQSLYYNQYAYVPPYGYSDQSYHTHLIjSTNTAYRQQ 
YEEQQKRQSLeQQQRGVDKJGAEMGLKEREAALKEEWKQKPSIPP 
TLTKAPSLTDLVKSGPGKAKEPGADPAKSVI I PKLDDSSKLPGQ 
A P EG LKVKLS DASHLS KEAS EAKTGAECGRQAEMDP I LWYRQBA 
EPRMWTYVYPAKYSDIKSEDERWKEERDRKLKEERSRSKDSVPK 








KDGKESTSSDCKLPTSEESRLGSKEPRPSVHVPVSSPLTQHQSY 

IPYMHGYSYSQSYDPNHPSYRSMPAVMMQNYPGSYLPSSySFSP 

YGSKVSGGEDADKARASPSVTCKSSSESKALDILQQHASHYKSK 

SPTISDKTSQERDRGGCGWGGGGSCSSVGGASGGERSVDRPRT 

SPSQRIiMSTHHHHHHIiGYSLLPAQYNLPYAAGLSSTAIVTVSOOG 
STPSLYPPPRR 


6049


215 


1089 


AMTGVFDRRVPSIRSGDFQAPFQTSAAMHHPSQESPTIiPESSAf 
DSD YYS P TGGAPHG YCS PTS AS YG\KALNP YQYQ YHG VNGSAGS 
YPAKAYADYSYASSYHQYGGAYNRVPSATNQPEKEVTEPEVRMV 
NGKPKKVRKPRTIYSSFQLAALQRRFQKTQYLALPERAELAASL 
GLTQTQVKIWFQNKRSKIKKIMKNGEMPPEHSPSSSDPMACKSP 
QSPAVWEPQGSSRSLSHHPHAHPPTSNQSPASSYLENSASWYTS 
AASSINSHLPPPGSLQHPLALASGTLY 


6051 


566 


1718 


KGLERTCCAMEBSDSEKTTEKENLGPRMDPPLGEPG\GSIiGWVL "' 
PNTAMKKKVLLMGKSGSGKTS MRS 1 1 FANYIARDTRRLGATI LD 
RIHSLQINSSLSTYSLVDSVGNTKTFDVEHSHVRFIK5NLVLNLW 
DCGGQDTFMENYFTSQRDNIFRNVEVLI YVFDVESRELEKDMHY 
YQSCLEAI LQNSPDAKI FCLVHKMDLVQEDQRDLIFKEREEDLR 
RLS RPLECS C FRTS I WDETLYKAWS S I VYQL IPNVQQLEMNLRN 
FAEIIEADEVLLFERATFLVISHYQCKEQRDAHRFEKISNIIKQ 
FKLSCSKLAASFQSMEVRNSNFAAFIDI FTSNTYVMWMSDPSI 
PSAATLINIRNARKHFEKLERVDGPKQCLLMR 




566 


1718 


KUi»ERTCCAMEESDSBKTTEKENLG?RMDPPlXaEPG\GSLGWVIi " 
PNTAMKKKVLLMGKSGSGKTSMRSI I FANYIARDTRRLGATILD 
RIHSLQINSSLSTYSLVDSVGNTKTFOVEHSHVRFLGNLVLNLW 
DCGGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESRELEKDMHY 
YQS CIjEAI LQNSPDAKI FCLVHKMDLVQEDQRDLI FKEREEDLR 
RLSRPLECSCFRTS IWDBTLYKAWSS IVYQLI PNVQQLEMNLRN 
FAEI I EADEVLLFERATFL VISHYQCKEQRDAHRFEKI SNI IKQ 
FKLS CS KLAAS FQSME VRNSNFAAF I DI FTSNT YVMWMSDPS I 
PSAATLINIRNARKHFBKLERVDGPKQCHjLMR 




566 


1718 


KGLERTCCJiMEESDSEKTTEKENI^PR^PPLGEPd\G^LGWVL 
PNTAWXXKVLLMGKSGSGXTSMRS 1 1 FANYIARJDTRRLGAT ILD 
R IHSLQ I NSS LSTYSLVDS VGNTKTFDVEHSHVRFLGNLVLNLW 
DCGGQDTFMENYFTSQRDN I FRNVEVLI YVFDVESRELEKDMHY 
YQS CLEAILQNS PEAK I FCLVHKMDLVQEDQRDL I FKEREEDLR 
RLSRPLECSCFRTS I WDETLYKAWSS I VYQLI PNVQQLEMNLRN 
FAE I IEADEVLLFERATFLVI SH YQCKEQRDAHRFEKI SNI I KQ 
FKLSCSKLAASFQSMEVRNSNFAAFIDI FTSNT YVMVVMSDPSI 
PSAATLINIRNARKHFEKLERVDGPKQCLLMR 


6053


201 


1704 


Kl^TliMNKSkWQSRRRHGkRSHQQNPWFRLRDSEDR^DSRAAQPA " 
HDSGHGDDESPSTSSGTAGTS S VPE LPG F YFDPE KKR YFRLL PG 

HNNCNPLTKESIRQKEMESKRLRLLQEEDRRKKIARMGFNASSM 
LRKSQLG FLNVTN YCHLAHELRLS CMERKKVQIRSMDP S ALAS D 
RFNLI LADTNSDRLFTVNDVTVGGSKYGI INLQSLKTPTLKVFM 
HENLYFTNRKV\NS VCWASLNHLDSH I LLCLMGLAE TPGCATLL 
PASLFVNSHPAGIDRPG\MLCS FRI PGAW S CAWSLN I QANNCFS 
TGLSRRVLLTNVVTGHRQSFGTNSDVI^FAI^PLLFNGCRS 
G3 IFAIDLRCGNQGKGWKATRL FHDSAVTS VRILQDEQYLMASD 
MAGKIKLWDLRTTKCVRQYEGHVNEYAYLPLHVHEEEGILVAVG 
QDC YTRI WSLHDARLLRTI PS P YPASKAD I PS VAFS SRLGGSRG 
APGLLMAVGQDLYCYS YS 


6054 


1 


1051 



P?IARLQEFGTSRRHPlAAPSGVHLLVRilGSHRIFSSP]^HIYXH 
KQS S SQQRRN FFFRRQ RDI SHS I VLPAAVS SAHPVPKH I KKPDY 
VTTGIVPDWGDSIEVKNEDQIQGLHQACQLARHVLLLAGKSLKV 
DMTTEEIDALVHREI ISHNAYPS P LGYGGF PKS VCTS VNNVLCH 
SIPDSRPLQDGDIINIDVTVYYNGYHGDTSETFLVGNVDECX5KK 
LVEVARRCRDEAIAACRAGAPFS VIGNTI SHITHQNGFQVCPHF 
/GHG IGS YFHGHPE I WHHANDSDLPMEEGMAFT IEPI ITEGS PB 
?K\^EDAWTVVSLD/TSKVSAQFEHTVL1TSRGAQILTKLPHEA 


" 6055 


421 


23^4 


p p yflls flaw wlygqs drtetd i sqs agp ppgtiiqcsalhhd p 
gcancsrfcrdcsppacqchthvfpgnalngvqppbls rtlal i 
s s rep prkkkksqtetg kerbrtsfltqggkrfelqhgiiag i cm 
tllitgds i vs aeavwdhvtmanreiiafkagdvikvldasnkfl w 
wwgqiddeegw fpas f vrlwvnhedeveegps dvqnghldp1jsd 
clclgrplqnrdqmranvineimsterhyikhlkdicegylkqc 
rkrrdmfsdeqlkvifgniediyrfqmgfvrdlekqynnddphi* 

S E I GPCFLEHQDG FWI YS E YCNNHLDACMELS KLMKDSRYQH FF 
EACRLLQQMIDIA\IDGFLLTPVQKICKYPLQLAELLKYTAQDH 
SDYR YVAAALAVMRNVTQQINERKRRLENIDK IAQWQAS VLDWE 
GEDILDRSSELIYTGEMAWIYQP\YGRNQQRVFFLFDHQMVLCK 
KDLIRRDILYYKGRIDMDKYEWDIEDGRDDDFNVSMKNAFKLH 
NKETEEIHLFFAKKLEEKIRWLRAFREERKMVQEDEKIGFEISE 
NQKRQAAMT VRKVPKQKGVNS ARS VP PS YPPPQDPLKHGQ YLVP 
\DGIAQ3QVFEFTEPKRSQSPFWQNFSRLTPFKK 


6056 


43 1 


3358 


SGGRGPVRVRSEQLSPSAEQVSQISQISLGRRPLSSLPPPPSRA 
LAPTRAPDTALTIMEVAEVESPLNPSCKIMTFRPSMEEFREFNK 
YLA YMESKGAHRAGIAKVI P PKEWKPRQCYDD IDNLLI PAP I QQ 
MVTGQSGLFTQ YNI QKKAMTVKEFRQLANSGKYCTPRYLDYEDL 
ERKYWKNLTFVAP I YGAD INGS I YDEGVDEWNIARLNTVLDWE 
EE CG 1 5 X EG VNTP YL YFGMWKTTFAWHTEDMDL YS INYLHFGEP 
KSWYAI?PEHGKRLERIJWFFPSSS0^3CDAFLRHKMTLISPSV 
LKKYGIPFDKITQEAGEFMITFPYGYHAGFNHGFNCAESTNFAT 
VRWIDYGKVAKLCTCRKDMVXISMDIFVRKFQPDRYQLWKQGKD 
IYTIDHTKPTPASTPEVKAWLQRRRKVRKASRSFQCARSTSKRP 
KADEEEE VSDE VDGAE VPNPDS VTDDLKVSEKSEAAVKLRNTEA 
SSEEE S S ASRMQVEQNLSDHI KLSGNSCLSTS VTEDI KTEDDKA 
YAYRSVPSISSEADDSIPLSTGYEKPEXSDPSELSWPKSPESCS 
SVAESNGVLTEGEESDVESHGNGLEPGEI PAVPSGERNS FKVPS 
IAEGENKTSKSWRHPIjSRPPARSPMTLVKQQAPSDEELPBVLSI 
EEEVEETESWAKPLIHLWQTKPPNFAAEQEYNATVARMKPHCAI 

ctllmpyhkpdssneendarwetkldewtsegktkplipemcf 

I YSEENI EYSPPNAFLEEDGTSLIiISCAKCCVRVHASCYGI PSH 

eicdgwlcarckrnawtaecclcnlrggalkqtknnkwahvmca 

VAVPB VRFTNVPE RTQ ID VGR I PLQRLKLKCI FCRHRVKRVSGA 

ciqcsygrcpasfhvtcahaagvl\mepddwpywnitcfrhkv 

NPNVKS KACEKV I S VGQTV I TiQIRNTR Y YSCRVMAVTSQTFYE V 
MFDDGS FSRDTF PED I VSRDCL KLGP PAEGEWQVKW PDG KLYG 
AKYFGSNIAHMYQVEFEDGSQIAMKREDIYTLDEELPKRVKARF 
VSAGRCHLGTCQVNSIiSSPHVSQAQQETYLGFW INS KKSQCNI F 
LSGTY 


6057


1


853 


hVAKbKEQEGEGGLGPRKEKGRARGRERRRKMQLTRCCFVFLVQ"' 
GS L YLVI CGQDDGPPGSEDPERDDHEGQPRPRVPRKRGH IS PKS 
RPMANSTLLGl^PPGEAWGIIX^PPNRPiraSPPPSAKVKKIFG 
WGDFYSNI KTVALNLLVTGKIVDHGNGTFSVHFQHNATGQGNIS 
ISLVPPSKAVEFHQEQQIFIEAKASKIFNC\RMEWEKVE\RGRR 
TS LFTHDPAKI CSRDHAQSSATWSCSQPFKWCVY I AF YSTD YR 
LVQKVCPDYNYHSDTPYYPSG 


6058 


1 


986 


HPLPSASLGLPSVSLGVSLCVRSALIiEAWPMIiPKRRRARVGSP 
SGDAASSTPPSTRFPGVAIYLVEPRMGRSRRAFLTGLARSKGFR 
VLDACSS EATHWMEBTSAEEAVS WQERRMAAAP PGCfTPPAI*LD 
I SWLTES LGAGOPVPVECRH15 T.WinDQVnDT CDSUnrorvn rv\n 

PTPLTHHNTGLSEALEriAEAAGFEGSEGRLLTFCRAASVLKAL 
PS P VTTLS QLQGLPHFGEHSSRWQELLEHGVCEEVERVRRSB / 
RIiFTQI FGVGVKTADRWYREGLRTLDDLREQPQKLTQQQKAGBP 
SREAGPWASLNCTIiDPSASXP 


6059


2 


3650 


qqdfssladltdhrahrcpgdgdddpqlswvasspsskdvaspt" 
qmigdgcdlglgeeeggtglpypcqfcdksfirlsylkrheqih 

SDKLPFKCTYCSRLFKHKRSRDRHIKLHTGDKKYHCHECEAAFS 
RSDHLKIHLKTHSSSKPFKCTVCKRGFSSTSSLQSHWQAHKKNK 
EHIiAJCSEKEAJUCDDFMCDYCEDTFSQTEELEiCHVIjTRHPQIjSEK 








ADLQCIHCPEVF VDENTLLAH I HQAHAN^Kk XCPMCPE \QFS S V " ™ 

\EGVYCHLDSHRQPDSSNHSVSPDPVLGSVASMSSATPDSSASV 

ERGSTPDSTLKPLRGQKKMRDDGQGWTKVVYSCPYCSKRDFNSL 

AVLEIHLKTIHADKPQQSHTC^ICLDSMPTLYNLNEHVRKIiHKN 

HAYPVMQFGNISAFHCNYCPEMFADINSLQEHIRVSHCGPNANP 

SDGNNAFFCNQCSKGFLTESSLTEH1Q\Q\AHCSVGSAKLESPV 

VQPTQSFMEVYSCPYCTNSPI FGS ILKLTKHIKENHKNI PLAHS 

KKSKAEQSPVSSDVEVSSPKRQRLSASANSISNGEYPCNQCDLK 

FSNFESFQTHLXLHLELLLRKQACPQCKEDFDSQESLLQHLTVH 

YOTTSTHYVCESCDKQFSSVDD\LQKH\LIiI>MPHPI»CCTHCT\L 

C^EVFDS\KVSl\QVHl4AVT<HSNEKKMYRCTACNWDFRKEADIiQ 

VHVKHSHLGNPAKAHKClFOGETFSTEVELQCHirrHSKKYNCK 

FCSKAFHAI I LLE KHLREKHC VFDAATENGTANGVPPMATKKAE 

PADIiQGMLLKNPEAPNSHEASEDDVDASEPMYGCDICGAAYTME 

VLLQNHRLRDHNIRPGEDDGSRKKAE FIKGSHXCNVCSRTFFSE 

NGLREHLQTHRGPAKHYMCPICGRRFPSLLTLTEHKVTHSKSLD 

TGTCRICKMPLQSEEEFI EHC3QMHPDLRNSLTGFRCWCMQTVT 

STLELKIHGTFHMQKLAGSSAASSPNGQGLQKLYKCALCLKEFR 

SKQDLVKLDVNGLPYGLCAGCMARSANGQVGGIiAPPEPADRPCA 

GLRCPECSVKFESAEDIiESHMQVDHRDLTPETSGPRKGTQTSPV 

PRKKTYQCIKCQMTFENEREIQIHVANHMIEEGINHECKLCNQM 

FDSPAKLLCHLI EHSFBGMGGTFKCPVCFTVFVQANKLQQHI FA 

VHGQEDKIYDCSQCPQKFFFQTELQNHTMSQHAQ 


6060 


2145 


202 


SYE^VGKNKLEVNHSQLKALCKCS^ 

KEPRSRGSRBRDNMLHLHHSCLCFRSWLPAMLAVLLSLAPSASS 
DISASRPNILLLMADDLG IGDIGC YG13NTMRTPNI DRIiAEDG VK 
LTQHISAASLCTPSRAAFLTGRYPVRSGMVSS1GYRVLQWTGAS 
GGLPTNETTFAKILEEKGYATGLIGKWHIjGIjNCESASDHCHHPL 
KHG FDH F YGM P FS LMGDCAR W ELS E KR VNLEQ KLN FLFQ VLAL V 
ALTLVAGKLTHLI PVSWMPVI WS ALS AVLLLASS YFVGALI VHA 
DCFLMRNHTITEQPMCFQRTTPLILQEVASFLKRMKH3PFLLFV 
SFLHVKI PLITMENFLGKSLHGLYGDNVKEMDWMVGRILDTLDV 
EGLSNSTL1 Y FTS DHGGSLENQLGNTQYGGWNGI YKGGKGMGGW 
EGGI RVPGI FRW PG VLPAGRVI GEPTS LMDVFPT WRLAG3E VP 
QDRVIDGQDIaLPLLLGTAQHSDHEFLMHYCERFLHAARWHOJtDR 
GTMWKVHFVTPVFQPEGAGACYGRKVCPCFGEKWHHDPPLLFD 
LSRDPSETHILTPASEPVFYQVMER\VQQAVWEHQRTLSPVPLQ 
LDRLGNIWRPWLQPCCGPFPLCWCLREDDPQ 


6061 


110 


1330 


MNl ri^^RKTIKNINTFE^RMLMLDGMPAVRVKTELLESEQGS PN " 
VHNYPDMFJVVPLLLNNVKGEPPEDSLSVDHFQTQTEPVDLSINK 
ARTSPTAVSSSPVSMTASAS3PSSTST3SSSSSRLASSPTVITS 
VSSASSSSTVLTPGPLVASASGVGGQQFLHIIHPVPPSSPMNLQ 
SNKLSB^HRI PVWQS VFAAT^TAVRS PGNVNOT1 WPLLEDGRG 
HGKAQMDPRGLSPRQSKSDSDDDDLPNVTLDSVNETGSTALS IA 
RAVQEVHPS PVSRVRGNRMNNQKFPCS I SP FSIES TRRQRTVLN 
PPDSRKTAYSTDCDF\ EGLQQKIiYTKS S S PGRVHRRTHTGE KP Y 
KCTWEGCTWKFARSDELrRHYRKHTGVKPFKCADCDRSFSRSDH 
LALHRRRHMLV 


6062


71 


1079 


ETMAKNGP ENCEDCH I LMAEAFXS KKICKSLKICGLVFG I LALT "* 
LIVLFWGS KHFWPEVPKKAYDMEHTFYSNGEKKKI YMBIDPVTR 
TEI FRSGNGTDETLEVHDFKNGYTGI YF VGLQKCF I KTQI KVI P 
EFS EPEE E IDENEE ITTTFF EQS VI MVPAEKP I ENRDFLKNSKI 
LEICDNVTMYW\INPTL\ISGTFAKQLHHNFAFIILVSELQDFE 
BEGEDLHFPANEKKGIEQNEQWWPQVKVEKTRHARQASEEELP 
INDYTENG IEFDPMLDERGYCCI YCRRGNRYCRRVCE PLLGYYP 
YPYCYQGGRVICRVIMPCNWWVARMLGRV 


6063


71 


1079 


ETMAKNGPENCEDCHILNAEJ^KSKKICKSLKICGLVFGILALT 
LIVLFWGSKHFWPEVPKXAYDMEHTFYSNGEKKKI YMBIDPVTR 
TEI FRSGNGTDETLEVHDFKNGYTGI YFVGLQKCFI KTQI KVI P 
EFS E PEEE IDENEE ITTTF FEQS VI W VPAS K PI ENRDFL KNSK I 
LE I CDNVTMY W\ INPTL\ ISGTFAKQLHHNFAF I ILVS ELQDFE 


CnCA 






EEGEDLHFPANEKKQIEQNEQWWPQVKVEKTRHARQASEBBLP 
INDYTENG I E FDPM LDE RGYCC I YCRRGNR YCRRVCK PLLGYYP 
YPYOrQGGRVICRVIMPCNWWVARMLGRV 


6064


913 


311 


KLPQSLPRvTKHSPPYSLEKMTDi,VAVWDVALSDGVHKlEFEHG 
TTSGKRVVYVDGKEEIRKEWMFiaVGKBTFYVGAAKTKATINID 
AI SGPAYE YTLEINGKSLKKYMEDRSKTTNT W VLHMDGENFRI V 
I*E KDAMD VWCNGKKLETAGE FVDDGTETHFS IGTH\ACYIKAV\ 
SS G \ KRKEGI I HTl> I VDNRB I PE IAS 


6065 
6066 


1153 
6*8 


641 


MS VRVARVAW VKGLGAS Y RRGAS SFPVPPPGAQGVAELLRDATG 
AEEEAPWAATBRRMPGQCS VI»Ij F PGQGSQ WGMGRGLLNYPRVR 
BLYAAARRVLG YDLLELS LHGPQETLDRT VHOQPAI FVASLAAV 
E KLHHLQPSVI EN CVAAAG F S VG E F AAL VFAGAM E FAP a 


6067




3470 


VKENMPATRKPMRYGHTEGHTEVCFDDSG^FIVTCGSDGDVRIW'" 
BDLDDDD P KFINVGEKAYSCALKSG KLVTAVSNNTI Q VHTFPEG 
VPDGI LTRFTTNANHVVFNGDGTKIAAGSSD\ FLVKI VDVMDSS 
QQKT?RGHDAPVLSLSFDPKDIFLASASCDGSVRVWQISDQTCA 
ISWPLLQKCNDVTNAKS ICRIAWQPKSGKLLAIPVEKSVKLYRR 
ES WSHQFDLSDNFISQTLNI VTWSPCGQYLAAGS INGL 1 1 VWNV 
ETKDCMERVKHEKGYAICGLAWHPTCGRISYTDAEGNLGLLENV 
CDPSGXTSSSKVSSRVEKDYNDLFDGDDMSNAGDFLNDNAVEIP 
SFSKGIINDDEDDEDLMMASGRPRQRSHILBDDENSVDISMLBCT 
GSSLLKEEEEDGQEGSIHNLPLVTSQRPFYDGPMPTPRQKPFQS 
GSTPLHLTHRFMVWNS IGI I RCYNDE QDNA 1 DVE FHDTS I HHAT 

HL SNTLNYT I ADLSHEAI L LACE S TDELASKLHCLHFSS WD SS K 
EWI IDLPQNEDIEAICLGOGWAAAATSAIiLLRLFTIGGVQKEVF 
SLAG P VVS^GHGEQLFI VYHRGTGFDGDQCU3 VQLLELGKKKK 
QILHGD PLPLTRKS YLAW I G FSABGTP C YVDS EG I VRMtiNRGLG 
NTWTPI CNTREHCKGKSDHYWWGIHENPQQLRCI PCKGSR FPP 
TLPRPAVAILSFKIiPYCQ IATEKGOMEEOFWR *? VT pitktut nvr n 

KNGYEYEESTKNQATKEQQELLMKMLALSCKLEREFRCVELADL 
MTQNAVNLAI KYASRSRKL I LAQ KLS ELAVEKAAE LTATQ VE E E 
EEEBDFRKKLNAG YSNTATE WSQPR FRNQ VEEDAEDSGEADDE E 
KPE IHKPGQNS FSKSTNSS D VS AKSGAVT FSSQGRVNP FKVSAS 
SKEPAMSMNSARSTNILDNMGKSSKKS^ALSRTTNNEKKPT tvd 
LIPKPK^KQASAASYFQKRNSQTNKTEEVKEENLKNVLSETPAI 
CPPQNTENQRPKTGFQMWLEENRSNILSDNPDFSDEADIIKEGM 
IRFRVLSTEBRKVWANKAKGETASEGTEAKKRKRWDESDETEN 
QEEKAKENLNLSKKQKPLDFSTNQKLSAFAFKQE 


" 6068 


858 


321 


LPWQRLGVT,I^RGKmVTGWLESI^TAQKTAIJ^ 
PDGKEMAEEYDEKTSELLVRKWRVKSALGAMGQWQLEVGDPAPL 
GAGNLGPELIKESNANPIFMRKDTKMSFQWRIRNLPYPKDVYSV 
SVDQKERCI I VRTTNKKYYKKFS I PDLDRHQLPLDDALLSFA\T 
PTAP 


6069 


13 


1730


GSKMADLANEEKPAIAPPVFVFQKDKGQKSPAEQKNLSDSGEEP 
RGEAEAPHHGTGHPES AGEHAIiEP PAPAGASAS TPP PPAPEAQL 
P P F PRELAGRS AGGS S PEGGEDSDREDGNYCPPVKRE RTSSLTQ 
FPPSQSEERSSGFRLKPPTLIHGQAPSAGLPSQKPKEQQRSVLR 
P AVL QAPQPKALSQTVPSS GTNGVS LPADCTGAVPAAS PDTAAW 
RS P S EAADE VCALEE KE PQ KNE S S NAS E E EACE KKDPATQQA FV 
FGQNLRDRVKLlNESVDEADMENAGRPSADTPTATNYFIjQYI SS 
SLEN3TNSADASSNKFVFGQNMSERVLSPPKLNEVSSDANRENA 
AAESGSESSSQEATPEKESLAESAAAYTKATARKCLLEKVEVIT 
GEEAESNVLQMQCKLFVFDKTSQSVA^ERGRGLLRLNDMASTDDG 
TLQSR LtS DAGPRGSLR \ L I LNTKLWAQMQ I DKAS EK\ S I RITAM 
DNEDQGVKVFLISASSKDTCQVYAALHHRILALRSRVEQEQEAK 
A P APE PGAAP SNEEDDS DDDDVLAP S GATAAGAGDEGDGQTTGS 

r 




583' 


27 1 

I 

3 


PTRPGQAGS S SAMAAQRLGKRVLS KLQS PSRARGPGGS PGGLOK 
IHARVTVKYDRREL^RRLDVEKWIDGRLEELYRGMEADMPDEIN 
EDELLELESEEERSRKIQGLLKSCXSKPVEDPIQELLAKLQGIiHR 








Q \ PGljRQPS PS P Vdgqpsapfqg pgartas pltllalfpg p per - 

RPALLCVLSCI 


6070 


478 


858 


IRVTVDGEFLHYIFPLQFI,DSPEW/RPTETHRGRHF\QVTLTAE 
TDCRYVSWRRKKLYLLFAQHRYISRLFSVLIGSDIADKLYALND 
RVYIGKRYKYD I RLPNF YQMSTPEI RRSPLTQHFQNSRRYN 


6071 


2 


1654 


HEARTKGNMAI*ARP \ VRLF SI*VTRI*LLAPRRGLTVRS PDE PLP V 
VRIPVALQRQLEQRQSRRRNLPRPVLVRPGPLLVSARRPELNQP 
ARLTLGRWERAP LASQG WKSRRARRDHFS I ERAQQEAPAVRKLS 
SKGSFADLGAWKPRVLHALQE\AAPEVVQ\PTTVQSSTIPSLLR 
GREIWCAAETGSGKTLSYLLPIjLQRLLG\HPSLDSLPIPAPRGL 
VLVP S R3LAQQ VRAVAQP LGRSLGLLVRDLEGGHGMRR IR LQLS 
RQPSADVLVATPGALWKALKSRLI SLEOJUSFLVLDEADTLLDES 
FLELVDYILEKSHIAEGPADLEDPFNPKAQLVLVGATFPBGVGQ 
LLNKVAS PDAVTT I TSS KLHCIMPHVKQTFLRLKGADKVAELVH 
ILKHRDRAERTGPSGTVLVFCNSSSTVNWLGYILDDHKIQHIiRIj 
QGQMPALMR VG I FQS FQKS SRDILLCTDI ASRGLDSTGVELWN 
YDFP PTLQDY I HRAGRVGRVGSEVPGT VISFVTHP WDVSL VQKI 
ELAARRRRSLPGLASSVKEPLPQAT 


6072 


1 


742 


KMERTEMMPTINSQLEFKSXPFPLVSSSRWLVKRGiE^AYVEDT 
VLFS RRTS KQQ VYFFLFNDVLI ITXKKSEES YNVNDYSLRDQLL 
VESCDNEELNSS PGKNSSTMLYSRQSSASHLPTLT VLSNHANEK 
VEMLLGAETQSERARWI TALGHSSGKPPADRTS LTQ VE I VR S FT 
AKQPDELSLQVADWIil \ YQRVSDGWYEGER\LRDGERGWFPME 
CAKE ITCQAT I DKNVERMGRLLGLETNV 


6073 


620 


860 


PCRRGLARPLSRRPG/SILVHCAVGVSRSAlXVLAYIiMLYHHLT " 
LVEA IKKVKDHRGI I PNRGFLRQLLALDRRLRQGLEA 


6074 


16B 


1110 


PGARCMATELQCPDSMPCHNQQVNSASTPSPEQLRPGDLILDHA 
GGNRASRAKVILLTG YAHSSLPAELDSGACGGSSLNS EGNSGSG 
DSS S YDAPAGNS FLEDCELSRQ I GAQLKLLPMNDQI R ELQT I IR 
DKTASRGDFMFSADRLIRIjWEEGLNQIjPYKECMVTTPTGYKYE 
GVKFEKGNCGVSIMRSGEAMEQGLRDCCRSIRIGKILIQSDEET 
QRAKVYYAKFPPDIYRRKVLLMYPILQTG\NTVIEAVKVLIEHG 
VQPSVI ILLSLFSTPHGAKSI IQEFPEITILTTEVHPVAPTHFG 
QKYFGTD 


6075 


320 


1091 


P?TGQPQEVE^\YGYVPILGNKTLPSRCHQCVIVSSSSHLLGT 
KLGPE I ERAECTI RMNDAPTTG YSADVGNKTTYRVVAHS SVFRV 
LRRPQEFVNRTPBTVFIFWGPPSKMQKPQGSLVRVIQRAGLVFP 
NMEAYAVSPGRMRQFDDLFRGETGKDREKSHSWLSTGWFTMVIA 
VELCDHVHVYGMVPPNYCSQRPRLQRMPYHYYEPKGPDECVTYI 
QNEHSRKGNHHRFI TEKRVFS SWAQLYG I TFSHPS WT 


6076 


1721 


107 


HPS PTEAPRVQHLTMDCTWR I LFLVAAATGTHAQv'OLVQSGAE V 
KKPGASVKVSCKVSGYTLTBLSMHWVRQAPGKGLEWMGAFDPED 
GET I YAQKFQGR VTMTEDTS TDTAYMELS SURS EDTAVYYCATD 
HGDYAFDI WGQGTMVTVSSAPTKAPDVFPI 1 8GCRHPKDNSP W 
LACLITGYHPTSV\TVTWYMGTQSQA\QRTFPEIQRRDSYYMTS 
S QLSTPLQQWRQGE YKCWQHTASKSKKEI FRWPES PKAQASS V 
P7AQPQAEGSLAKATTAPATTRNTGRGGEEXKKEKEKEEQEERE 
TKTPECPSHTQPLGVYLLTPAVQDLWLRDKATFTCFWGSDLKD 
AHLTWEVAGKVPTGGVE EGLLE RHSNGSQSQHS RLTL P RS LWNA 
GTSVTCTLNHPSLPPORLMALREPAAQAPVKLSLNLLASSDPPE 
A\ASWLLCEVSGFSPPNIIjLMWIjEDHGEVNT3GFAPARP1jPKP\ " 
RSTTFWA\WSVLRVPAPPSPQPATYTCWSHEDSRTLLNASRSL 
BVSYVTDHGPMK 


6077 


3687 


1268 


LLPDMNLQP I FWIGLISS VCCVFAQTDENRCLKANAKS OGECIQ 
AG P NCG W CTNS T FLQEGMPTSARCDDLEALKKKG CP PDD I ENPR 
GSKD I KKNKNVTNRSKGTAEKLKPEDITQIQPQQLVLRLR5GEP 
QTFTLKFKRAEDYPIDLYYIW\DLSYSMKDDLENVKSLGTDLMN 
EMRRITSDFRIGFGSFVEKTVMPYISTTPAKLRNPCTSEQNCTS 
P FS YKNVLS LTNKGB VFNELVGKQRISGNLDS PEGGFDAIMQ VA 
VCGS L I GWR27VTR LL VFS TDAG FKFAGDGKLGG I VLPNDGQCHL 








ENNMYTMSHYYDYPSIAHLVQKLSENNIQTIFAVTEEFQPVYKE 
LKNLI P KSAVGTLS ANSSNVI QLI IDAYNS LS SEVI LENGKLS E 
GVTISVQSY\CKNGVNGTGBNGRKCSNISIGDEVQFEISITSNK 
CPKKDSDSFKIRPLGFTEEVEVXliQYI CECECQSEGI PESPKCH 
EGNGTFECGACRCNEGRVGRHCECSTDBVNSEDIGCFTARKENQ 
FQKSASNHGRVPSAGQCVCRKRDNTNE I YSGKFCECDNFNCDRS 
NGLI CGGNG VCKCRVCECNPNYTGSACDCSLDTSTCEASNGQ I C 
NGRG I CECGVCKCTDP KFQGQTCEMC3QTCLGVCAEHKECVQCRA 
FNKGEKKDTCTQECSYFNITKVESRDKLPQPVQPDPVSHCKEKD 
VDDCWFYPTYSVNGNNEVMVHVVENPECPTGPDIIPIVAGVVAG 
I VL I GLALLL I WKLLM I IHDRREFAKFE KE KMNAKWDTGENP I Y 
KS AVTTWNP KYEGK 


6078 


1426 


180 


ETEDVMELLEEDLTCPICCSIiFDDPRVLPCSHNFCKKCLEGIIiE 
GSVRNSLWRP VPFKCPTCRKKTFS YWEL I PLQVNYSLKG I VEKY 
NKI XI S FKMP VCKGH\ LGQPLNI F \ CL\ TDMQLDL/CG I C\ATR 
GEHTKHVFCS IEDAYAQERDAFESLFQSFETWRRGDALSRLDTti 
ETSKRKSLQIjLTKDSDKVKEFFEKLQHTLDQKKNEILSDFETMK 
IAVMQAYDPEINKLNTILQEQRMAFNIABAFKDVSEPIVFLQQM 
QEFREKIKVIKETFLPPSNLPASPLMKNFDTSQWEDIKLVDVDK 
LSLPQDTGTFISKIPWSFYKLFLLILLLGLVIVFGPTMFLEWSL 
FDDLATWKGCLSNFSS YLTKIADFIEQSVFYWEQVTDGFFI FNE 
RFKNFTLWLNNVAEFVCKYKLL 


" 6079 


1586 


141 


ATARDLGCARRIDRWMESTPS RGLNRVHLQCRNLQEFLGGLSP " 
GVLDRLYGHPATCIiAVFRELPSLAKNWVMRMLFLEQPLPQAAVA 
LWVKKE FSKAQEESTGLLSGIjRI WHTQLLPGGL(X5LIIiNPI FRQ 
NLRIALLGGGKAWSDDTSQLGPDKHARDVPSLDKYAEERWEWL 
HFMVGS PSAAVSQDLAQLIjSQAGIiMKSTEPGE PPCITSAGFQFL 

lldtpaqlw y fmlqylqtaqsrgmdlveils flfqls fs tlgkd 
ysvegmsdsllnflqhlrefglvpqrkrksrryypt/ralainl 
ssgvsgaqgtvhqpgfiv\vetnyrlyayteselqialialfse 

MLYPFP\NMW\ARVTR\ESVQQAIASGITAQQI IHFLRTRAHP 
VKLKQT? VLP PTITDQ I RJLWELERDRLRFTEGVLYNQFLS Q VX>F 
ELL \ LAHAP KLGVLVFB /ntpakrlmwtpaghsdvkr FWKRQK 
HSS 


6080 


1 


1199 


IETIDHVGEFAMAAQAAivSk^RAATQGLGSNQNALKYLGQDFK 
TLRQQCLDSGVLFKDPEFPACPSALGYXDLGPGSPQXQGIIWKR 
PTELCPSPQFIVGGATRTDICQGGLGDCWLLAAIASLTLNEEIiL 
YRWPRDQDFQENYAGIFHFQPLCPPS?\FWQYGEWVEWIDDR 
IiPTKNGQIiLFLHSEQGNE FWSAIjLEKAYAKIjNGCYEALAGGSTV 
EGFEDFTGGXSEFYDLKKPPANLYQI IRKALCAGSLLGCSIDVY 
SAAEAEAI TSQ KLVKSHAYS VTG VE EVNFQGHPEKLI RLRNPWG 
EVEKSGAWSDDAPEWKHIDPRRKBELDKKVEIX3EFWMSLSDFVR 
QFSRLE I CNLSPDSLSS BEVHKWNLVLFNGHWTRGSTAGGCQNY 
PGSS 


6081 


3 


865 


EMLP LLLPLPLLWA/ GALAQDARFRLEMPBSVTVQEGIjC I FVHC 
SVFYLEYGWKDSTPAYGHWFREGVSVDQETPVATNNSTQKVQKE 
TQGRFHLLGDPSRNNCSLSIRDARRRDNGSYFFWVARGRTKFSY 
KYSPLS VYVTALTH RPDIL I PE FLKSGHPSNLTCS VP WVCEQGT 
PPIFS WMSAAPTS LGPRTLHSSVLTI I PRPQDHGTNLI GQVT FP 
GAGVTTERTIQLS VSWKSGTVEEVVVLAVGVVAVKILLLCIiCIj I 
ILSFHKKKAVRAVEVEENVYAVMG 


6082


283 


12Q8 


earspgptqtrtapgu\apgiaqpaalrlllsrppsaamdgik3d 
pesvgqpeeaspeeqpeea5aeeerpedqqeeeaaaaa\y\lde 
l p eplla/ lrvlaalprh e \l»vqacr \ lvclrw ke lvdg aplwl 
lkcqqeglvpeggveeerdhwqqfyflskrrrnllrnpcgeedl 
egwcdvehggdgwrveelpgdsgvefthdesvkkyfassfewcr 
kaq\h:dlqaegyweelijdttqpaivvkdwysgrsdagclyeltv 
klls ehenvlae fss gqvavpqd s dgggwmh i shtftdygpgvr 
fvrfehggqdsvywkgwfgarvtnssvwvep 


6083 


1865 


309 


kqwcaerrglgmsladelladleeaaeeeeggsygeeeeepaie " 






WO 01/53312 



PCT/USOO/34263 









6084



6085 



"0865- 






Tor 



2419 



~~}4 56 



13S7 



ToTT 



"TfT 



6088 



1684 



1877 



i UVQEETQLDLSGDSVKTIAKLWDSkMF AEIMMKIEEYISKC2AXA~ 
SEVMGPVEAAPEYRVIVDANNLTVEIENELNIIHKFIRDKYSKR 
FPELES LVPNALDY I RTVKELGNS IiDKCKNNENLQQ I LTNAT I M 
WSVTASTTQGQQIiS EEELERLEEACDMALELNASKHRI YEYVE 
SRMSFIAPNLSIIIGASTAAKIMGVAGGIiTNLSKMPACNIMLLG 
AQRKTLSGFSSTSVLPHTGYIYHSDIVQSLPP1PPPPSVAP\DL 
RRKAARIiVAAKCTLAARVDSFHESTEGKVGYELKDEIERKFDKW 
QEPPPVKQVKPLPAPLDGQRKKRGGRRYRKMKERLGLTEIR\KQ 
ANRMS FGE1 EEDAYQE DLGFS LGHLGKSGSGRVRQTQVNEATKA 
RISKTLQRTLQKQSWYGGKSTIRDRSSGTASSVRFTPLQGLEI 
VNPQAAEKKVAEANQKYF3SMAEF LKVKGEKSGLM5T 
KQ WCAERRGIjGMSIiADELIj ADIiE EAAEEEEGGS YGEE^EE PAI B
D VQE E TQLOLS GDS VKTI AKLWDS KMFAE I MMKI EE YI S KQAKA 
SEVMGPVEAAPEYRVIVDANNLTVEIENELNI IHKFIRDKYSXR 
FP ELESL VPNALDY IRTVKEIiGNSLDKCKNNENLQQ 1 LTNAT1 M 
WSVTASTTQGQQLSEEELERLEEACDMALELNASKHRIYEYVE 
S RMS F I APNLS 1 1 1GASTAAKI MGVAGGLTNIiS KMPACNI MLLG 

aqrktlsgfsstsvlphtgyiyhsdivqsi*ppipppfsvap\dl 

RRKAARLVAAKCTLAARVDSFHESTEGKVGYELKDEIERKFDKW 
QEPPPVKQVKPLPAPLDGQRKKRGGRRYRKMKERLGLTEIR\KQ 
I ANRMSFGE IEEDAYQEDLGFSLGHLGKSGSGRVRQTQVNEATKA 
RISKTLQRTLQKQSWYGGKSTIRDRSSGTASSVAFTPLQGIaBI 
VNPQAAEXKVAEANQKYFSSMAEFLKVKGBKSGLMST 

SGPRSFQGNRAVGRISLGGKRNPEVTLLPGVSSERVRRWRRARV 
GVAR VKPGNPWKP SPATQ V PR/ VP AQ VYLPGRGP P LR EGE ELVM 

f DEEAYVLYKRAQTGAPCLS FD I VRDHLGDNRTELPLTL YI*CA.GT 
QAESAQSNRIjMMLRMHNLHGTKPPPSEGSDEBEEEEDEEDEEER 
KPQLELAMVPHYGGINRVRVS WLGE B PVAGVWSE KGCjVEVFALR 
RLLQWSEPQALAAFLRDEQAQMKP I FSFAGHMGEG FALDWS PR 
VTGRLLTGDCQKNIHLWTPTDGGSVfHVDQRPFVGHTRSVEDLQW 

I SP TENTVFASCSADASI R I WD IRAAPS KACMLTTATAHDGDVNV 
IS WS RREPFLLSGGDDGALKI WDLRQ FKSGS PVATFKjQHVAPVT 

SVBWHPQDSGVFAASGADHQITQWDLG/IVERDPEAGDVBAD^G

IADLPQQLLFVHQGETELKELHWHPQCPGLLVSTALSGFTIFRT 
ISV 



689 



GAATQHGGAMNLLPC^PHGNGLLYAGFNQDKGCFACG>m-GFRV
YNTDPLKEKEKQEFLEGG VGHVEMLFRCNYLALVGGGKKP KYP P
NKVM I WDDUCKKT V I E I EFS TE VKAVKLRR\DKI WVLDSMI KV 
I FTFTHNP \HQLHVFE \TC YNPKGLC VLCPNSNNSLLAFPGTHTG 
HVQLVDLASTEKPPVDI PAHEG VLSC 1 ALNLQGTR IATASEKGT 
L IRI FDTSSGHL IQBLRRG SQAANT YCINFNQDASLI CVSSDHG 
TVHIFAAE0PKRNKQSSLASASFLPKYFSSKWSFSKFQVPSGSP 
DD^ GTEPNAV ^ IC ^ SYY ^ L ^ P ^ ECI ^^ ; ^ PLEMT 

QNSQRTGLPiTIFSRSFPLX>TGSDLCENMPC M rCTWRNWRQWIRP 
| LVAVI YLVS I WAVPLCVWELQKLEVG IHTKAWFIAGI FLLLT I 
P I SLWV I LQHLVHYTQPELQ KP I IRILWMVPI YSLDSWIALKYP 
GIAIYVDTCRECYEAYVIYNFMGFLTNYLTNRYPNLVLILEAKD 
QQKHFPPLCCCPPWAMGEVLLFRCKLGVLQYTVVRPFTTIVALI 
CELLGIYDEGNFSFSHAWTYLVI innmsqlfamyclllfykvlk 
EELSP IQP VGFCFLCVKLWFVS FWQAWI ALLVKVGVT S E KHT W 
EWQTVEAVATGLQDFI I C1EMFLAAI A\HHYTFS YKPYVQEAEE 
GSCFDSFLAMWDVSDIRDDISEQVRHVGRTVRGHPRKKLFPEDQ 
DQNEHTSLL5SSSQDAIS IASSMPPSPMGHYQGPGHTVTPQTTP 
TTAKISDEILSDTIGEKXEPSDKSVDS 

GA5GLVRLLQQGHRCLLAPVAPKLVPPVRGVKKGFRAAFRFQKE 
LERQRIiLRCP PPPVRRS EKPNWDYHAEIQAFGHRLQENFS LDLIi 
I KTAFVNS CYIKS EEAKRQOLG IEKEAVLLNLKSNQELSEQGTS F 
' SQTCLTQFLEDSYPDMPTEGI KNLVDFLTGEEWCHVARNLAVE 
QLTLSEEFPVPPAVLQQTFFAVIGALLQSSGPBRTALF1RDFLI 
I TQMTGKELFEMWKI INPMGIiLVEELKKRNVSAPESRLTROSG\A


SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








PTALPLYFVGLYCDKKLIAEGPGETVLVAEEEAARVAIiRKLYGF 
TENRRPWNYSKPKETLRAEKSITAS 


6089 


3


3054 


TRLGIPGSTISSRPRU^UiAAEGHFLGHSWTGSRAGAHTGAPAW 

PSRRLRDLPAGGMWRLRRAAVACEVGQSLVKHSSGIKGSLPLQK 

LHLVSRS I YHSHHPTIjKLQRPQliRTSFCXiPSSLTNIiPLRKLKFS 

P I KYG YQPRRNFWPARLATRLLKLR YL ILGSAVGGG YTAKKTFD 

QWKDMIPDLSEYKWIVPDIVWEIDEYIDFEKIRKALPSSBDLVK 

LAPDFDKIVESLSLLKDFFTSGSPEETAFRATDRGSESDKHFRK 

VSDKEKIDQLQEELLHTQLKYQRILERLEKENKELRKLVIiQKDD 

KG I P FIES LRKS LIDMYSEVLDVLS DYDAS YNTQDHL PRVVWG 

DQSAGKTSVLEMIAQARIFPRGSGEMMTRSPVKVTLSEGPHHVA 

LFKDSSREFDLTKEBDLAALRHEIELRMRKNVKEGCTVSPETIS 

LNVKGPGLQRMVLVDLPGVINTVTSGMAPDTKETIFS I SKAYMQ 

DPNAIILCIQDGSVDAERSIVTDLVSQMDPHGRRTIFVLTKVDL 

AEKNVASPSRIQQIIEGKLFPMKAliGYFAWTGKGNSSESIEAI 

RE YEEEFFQNS KLLKT SMLKAKQVTTRNL5LAVS DCFWKMVRES 

VEQQADSFKATRFNLETEWKNNYPRLREIiDRNELFBKAKNEILD 

BVISLSQVTPKHWEEIIjQQSLWERVSTHVIENIYLPAAQTMNSG 

T FNTT VD I KLKQWTDKQL PNKAVEVAWETLQEE FSR FMTE P KG K 

EHDD I FDKLKEAVKEES I KRHKWND FAEDS LR VI QHNALEDRS I 

SDKQQWI)AAIYF^EEALOARLKDTENAIENlWGPD\WKKRWLYW 

KNRTQEQCVHNETKNELEKMLKCNEEHPAYLASDEITTVRKNIjE 

SRGVEVDPSLIXOTWHQVYRRHFLKTALNHCNLCRRQFYYYORH 

FVDSELECNDVVLFWRIQRMLAITANTLRQQLTNTEVRRLEKNV 

KEVLEDFAEIXSEKKIKEiLTGKRVQIiAEDLKXVREIQEKLDAFIE 

ALHQEK


6090 


194 


1560 


PVFVPAiKSaviiEQAS/ASPPLATQTWPLQHCltiPEXPVQA^lE - 
FELQI^FCQLIAIjFVHYINIYKTVWWYPPSHPPSHTSLNFHIiID 
FNLLMVTTI VLGRRFIGS I VKEASQRGKVSLFRS ILLFLTRFTV 
LTATGWSLCRSLIHLFRTYS FXNLL/FPLLS VWDVHS VPAAELR 

p\rktslfnhmasmgpreavsgiaksrdylltlr\rrgsstqds 

CMARTPCP/PHACCLS PSL I RSEVBFLKMDFNWRMKE VLVSSML 
3 AYYVAFVPVWFVKNTHY YDKRWSCE LFLLVS I S TS VILMQHIiL 
PASYCDLLHKAAAHLGCWQKVDPALCSNVLQHPWTEECWWPQGV 
LVKKS KNVYKAVGHYNVA I PSD VSH FR FHFFFS KPLRILNI LLL 

legavivyqlyslmssekwhqtislalilfsnyyaffkllrdrl 
vlgkays ysas pqrdldhrfs 


6091 


3279


412 


SSRTREMEEKBILRRQIRLLQGIiIDDYKTIiHQNAPAPGTPAASG 
WQPPTYHSGRAFSARYPRPSRRGYSSHHGPSWRKKYSLVNRPPG 

psdppadhavrplhgarggqppvpqqhvlerqvqlsqgqnwik 
vkppsksgsasasgaqrgsleefedtpwsdqrpregegepprgq 
lqpsrptrargtcsvedpllvcqkepgkprmvks vgsvgds pre 
prrtvsesviavkasfpssalpprtgvalgrklgshsvascapq 

LliGDRR VDAGHTDQPVPSGS VGGPARPASGPRQAREAS t»WTCR 

tnkfricnnykwvaassksprvarralsprvaaenvckasagman 
kvekpqliadpepkprkpatsskpgsapskykwkasspsassss 

SFRWQSEAGSKDHASQLSPVLSR3PSGD\RPAVGHSGLKPLSGE 

tplsaykvxsrtkiirrrgstslpgdkksgtspaatakshlslr 
rrqalrgksspvlicktpnkglvqvtthrlcrlppsrahlptkba 
sslhavrtaptskviktryrivkkitasplsappfplslpswra 
rrlslsrslvlnrlrpvasgggkaqpgspwwrskgyrciggvly 
kvs ankjls ktsgqpsdagsrpllrtgrld p agscsrslasravq 
rslaiirqarqrrekrksycmyynrfqrcnrgercpyihdpekv 

AVCrRFVRGTCKlCTDGTCPFSHHVSKEKMPVCSYFLKGlCSNSN 
CPYSHVYVSRKAEVCSDFLKGYCPIiGAKOaCKHTLLCPDFARRG 
ACPRGAQCQLLHRTQKRHSRRAATSPAPGPSDATARSRVSASHG 
PRXP5A5 QR PTRQTPSS AALTAAA VAAP PHCPGGS AS P SSS KAS 
SSSSSSSSPPASLDHEAPSLQEAALAAACSNRLCKLPSFISLQS 
S PS PGAQPRVRAPRAPLTKDSGKPLHIKPRL 


6092 


143 


3190 


AKAP PTGESS EP EAKVLHTRRLYRAVVEAVHRLDLILCNKTAYQ 
EVFKPENISLRNKIjRELCVKLMFIiHPVDYGRKAEEIjLTOKVYYE








V IQL I KTNKKHI HSRS TLECAYRTHLVAG IG F YQHLLL Y IQSH Y 
QLELQCCIDWTHVTDPLIGCKKPVSASGKEMDWAQMACHRCLVY 
LGDLSRYQNELAGTOTBIiLAERFYYQALSVAPQIGMPFNQLGTIj 
AGS KYYNVEAMYCYLRC IQSEVS FEGAYGNLKRLYDKAAKMYHQ 
LKKCETRKLSPGKKRCKPIKRtiLVNFMYLQSLLQPKSSSVDSEL 
TSLCQSVLEDFNLCLFYLPSSPNLSLASEDEEEYESGYAFLPDL 
LI FQMVI I CLMCVHSLERAGSKQYSAAI AFTLALFSHLVNHVNI 
RLQAELEEGENPVPAFQSDGTDEPESKEPVEKEEEPDPEPPPVT 
PQVGEGRKSRKFSRLSCLRRRRHPPKVGDDSDLSEGFBSDSSHD 

EEEGTRSPTLEPPRGRSEAPDSLNGPLGPSEASIASNLQAMSTQ 
MFQTKRCFRLAPTFSNLUjQPTTNPHTSASHRPCVNGDVDKPSE 
PASEEGSESEGSESSGRSCRNERSIQEKLQVLMAEGLLPAVKVF 
LDW LRTNPDL I IVCAQ S SQS LWNRLS VLLNLLPAAGELQESGLA 
LCPEVQDLLEGCELPDLPSSLLIiPEDMALRNLPPLRAAHRRFNF 

dtdrpllstleeswriccirsfghfiarlqgsilqfnpevgif 
vs i aqs eqesijjqqaqaqfrmaqeearrnrlmrdmaqiirirqlev 

nLPR ct jnOPKAOSAMSPYTjVPDTOAIiCHHIi P VI ROLATSGR F I 
VIIPRTVIDGLDLIiKKBHPGARDGIRYLBAEFKKGNRYIRCQKE 
VGKSFERHKLKRQDADAVTTIiYKIIiDSCKQLT\LAQGAGEEDPSG 
MVTIITGLPLDNPSLLSGPMQAALQAAAHASVDIKNVIiDFYKQW 
KEIG 


6093 


76 


1002 


ACGRRAMLALRVART/SRWGAL \ RGAVWAPGTRPS KRRAC WALL 
PPVPCCLGCLAERWRLRPAALGLRLPGIGQRNHCSGAGKAAPR\ 
PAAGAGAAAEAPGGQWG PASTPSLYENPWTI PNMLSMTRIGLAP 
VLG YLI IEEDFNIALGVFALAGLTDLLDGFI ARNWANQRSALGS 
ALDPLADKILISILYVSLTYADLIPVPLTYMI ISRDVMLIAAVF 
YVRYRTLPTPRTLAKYFNPCYATARLKPTFXSKVNTAVQLILVA 
ASLAAPVFNYADSI YLQILWCFTAFTTAASAYSYYHYGRKTVQV 
IKD 


6094 


23 


1010 


PFLRCLRGDQKAKMSERkVLNKY YP PDFDPS KI PKLKLPKDRQ Y 

VVRLMAPFNMROCTOSEYIYKGKKFimKETVQNEVYl^ 

FY I KCTRCLAE I TFKTDPENTD YTMBHGATRNFQAEKLLEBEE K 

RVQ KEREDE ELNNPMKVLENRTKDS KLEMEVLENLQEL KDLNQR 

QAHVDFEAMLRQHRLSEEERRRQQQEEDBQETAALLEEARKRRL 

LEDSDS EDEAAPS PLQPALRPMPTAILDEAPKPKRKVEVWEQS V 

GSLGSRPPLSRLWVKKAKADPDCSNGQPQA/APHPRSPAEQEG 

GQPYTPDAWRVLPEPTGCIPGQ 


6095 


1 


1599 


TRGRAAERSRGRGHGFLGGGFA\SWDYFPSEDFYRCGYCKNES 
GSRSNGMWAHSMTVQDYQDLIDRGWRRSGKYVYKPVMNQTCCPQ 
YTIRCRPLQFQPSKSHKKVLKKMLKFLAKGEVPKGSCE\DEPMD 
STMDDAVAGDFALINKLDIQCDLKTLSDDIKESLESEGKNSKKE 
EPQELLQSQDFVGEKLGSGEPSHS 






WO 01/53312 



PCT/US00/34263 











VKVHTVPKPGKGADLSKPPCRKAKEIRKBRKRtKtMQQ^PAGEL 
EGFQAQGHPPSLFPPKAKSNQPKSLEDLIFESLPENASHKLEVR 
WRSSPPSSQPKATIiLESYQVyKRYQMVIHKNPPDTPTESQFTR 
FLCSS PLEAETPPNGPDCG YGS FHQQ YWLDGKI I AVG VT DI LPN 
CVSSVYLYVDPDYSFLSLGVYSALRBIAFTRQLHEKTSQLSYYY 
MGFYIHSCPKMKYKGQYRPSDLLCP2TYVWVPIEQCLPSLENSK 
YCRFNQDP EAVDEDR5 TE PDRLQVFHKRA I M PYGVYKKQQKDPS 
EEAAVLQYASLVGQKCSERMLLFRN ' i 


6096 


2277 


575 


QRVRAALLS S AMEDSEALGFEHMGLDPRLLQAVTDLG WSR PTL I "" 
QBKAI PLALEGKDLLARARTGSGKTAAYAI PMLQLLLHRKATGP 
WEQAVRGLVLVPTKELARQAQSMIQQLATYCARDVRVANVSAA 
EDSVSQRAVLMEKPDVWGrPSRlLSHLQQDSLKIiRDSLELLW 
DE ADLL FS FGFEEELKSLLCHLPR I YQAFLMSATFNED VQ ALKE 
l» I LHNP VTL KLQESQLPG PDQLQQ FQ WCETEEDKFLLL YALLK 
LSLIRGKSLLFVNTLERSYRLRLFLEQFSIPTCVLNGELPLRSR 
CHI ISQFNQGFYDCVIATDAEVLGAPVKGKRRGRGPKGDKASDP 
EAG VARG IDFHHVSAVLNFDLPp TPEAYIHRAGRTARANNPG I V 
LTFVLPTEQFHLGKIEELLSGENRGP ILLP YQFRMEE I EGFRYR 
CRDAMRS VTKQAIREARLKE I KEBLLHSEXLKT YFEDNPR \ DIiQ 
LLRHDLPLH PAVVKPHLGHVPDYLVP PALRGLVRPHKK\GRS CL 
PLVGRPREQS PRTH CAAS STKERNS DPQPSPPEWGPLWS 


6097 


■LP r J 


192 


APGTMSGGKKKS S FQITSVTTDYEGPGS PGASDPPtfPQPPTGP"P~~ 
PRLPNGEPSPDPGGKGTPRNGSPPPGAPSSRFRWKLPHGLGEP 
YRRGRWTCVDVYERDLEPHS FGGLLEG IRGASGGAGGRSLDSRL 
ELASLGLGAPTPPSGLSQGPTSWLRPPPTSPGPQARSFTGGLGQ 
LWPSKAK^EKPPLSASSPQQRPPEPETCESAGTSRAATPLPSL 
RVEAEAGGSGARTP PLSRRKAVDMRIiRMELGAPE EMGQVPPLDS 
RPSS PAIiYFTHDASLVHKSPDPFGAYAAQKFS LAHSMIAI SGHL 
DSDDDSGSGSLVGIDNKIEQAMDLVKSHLMFAVREEVBVLKEQI 
RELAERNAAliEQENGbLRALA\SPEQLGSAGPPRGVPR\LGPPA 
PNGPFVLSLPSLTIVPLGLPGLASAAWPPLPMPALIVPVFPGVG 
VQALSNGPWS PGPLPHLLI IPS LDGGGEGFRTGRQQGAP FG EET 
QPPPSLPGTPQQ 


6098 


168 


1074 


WYCLRHRSPUSKDSSPGSSSTSLLIKKQRETSDTPIMRALKELD 
EG K I F KNWGTQTEKEDTSN INPRQTETS VNASRS PEKCAQQRQK 
RI»NS AS QRS SSLPP5NRKS STP TKRE I MLTP VTVAYS PKRS P KB 
NI*S PGFSHLLS KNES S PIRFDI LLDDLDTVP VS TLQRTNPRKQL 
\Q FLPLDDSE EK\TYSE KAT \DN I VNHS SCPEP VPNGVKKVSVR 
TAWEKNKSVSYEQCKPVSVTPQGNDFBYTAKIRTIAETERFF\D 
ELTKEKDQIEAALSRMPS PGGR ITIjQTRLNQEAFGRS FGKD 


6099 


168 


1074 


NYCIiRHRSPLEKDSSPGSSSTSLLIKKQRETSDTPIMRALKELD 
CAiM r iun w^TQTEKEDTSNINPRQTETS VNASRS PEKCAQQRQK 

rlnsasqrssslppsnrksstptkreifiltpvtvayspkrspke 
nlspgfshllsknesspirfdillddldtvpvstlqrtnprkql 
\qflplddseek\tysekat\dwivnhsscpepvpngvxkvsvr 

TAW2KNKSVSYEQCKPVSVTPQGNDFEYTAKIRTLABTERFF\D 
ELTKEKDQ I EAALSRMPS PGGRITLQTRLNQEAFGRS FGKD 


6100
6101


2 


713 


FVEV3aYRS]UU)PBPRGRDTMTYAYLFKYIIIGDTGVGKSCLI,L 

qftdkrfqpvhdltigvefgarmvniogkqiklqiwdtagqesf 
rsitrsyyrgaagallvyditrretfnhltswledarqhsssnm 
vimlignksdlesrrdvkrebgbafare\hglifmetsaktacn 
veeafintakeiyrkiqqglpdvhneangikigpqqsistsvgp 
sasqrnsrd igsnsgcc 




1


1399 


frgrawplrevshwlgcrrvcswsaswgrj,palsarlspiZafr~ 

GK>IVFPLSCAVQQYAWGKMGSNSEVARIiIiASSDPIAQIAEDKPY 

aelwmgthprgdakildnrisqktlsqwiaenqdslgskvkdtf








NGNLPFLFKVLSVETPLSIQAHPNKELAEKLHLQAPOkyPDANH 
K P BMA I ALTP PQGLCG FRPVEE I VTFLKKVPEFQFL I GDEAATH 
LKQTMSHDSQAVASSLQSCFSHLMKSEKKVVVBQLJ7LLVKRISQ 
QAAAGNNMEDI FGELLLQLHQQYPGDIGCFAI YFLNLLTLKPGE 
AMFLBANVPHAYLKGDCVECMACSDNTVRAGLTPKFIDVPTLCE 
MLSYTPSSSKDRLFLPTRSQBDPYLSIYDPPVPDFTIMKA\EVP 
G\SVTEYKDLALDSAS ILLMVQGTVIASTPTrQTPIPLQRGGVL 
F IGANES VS LKLTE PKDLLI FRACCLL 


6102 


70 


2415 


QTPQATLAANGAEDSRGGEMLPAGHilGASPAAPCCSESGDERKN 
LEEKSDINVTVLIGSKQVSEGTDNGDLPSYVSAFIEKEVGNDLK 
S L KK LD KL I EQRT VSKMQU3EQVLTI S SE I P KRIRSALKNAE ES 
KQFLNQFLEQEIHLFSAI NSHLLTAQP WMDDLGTMI SQ IBE I BR 
HLAYLKWISQIEELSDNIQQYLMTNNVPBAASTLVSMAELDIKL 
QESS CTHLLG FMRATVKFWHKI LKDKLTS DFEE ILAQLHWP F I A 
P PQSQTVGLS RPASAPE I YSYliETLFCQLLKLQTSHELLTEPK\ 
HSQKNTLFLPPLLSS/WPIQVMLT?LQKRFRYHFRGNRQTNVLS 
KPEW YLAQVLMWIGNHTEFLDE KI Q P ILD K VGS LVNARLE FS RG 
LMMLVLE KLATD I PCLL YDDNLFCHLVDB VLLFERELHS VHG YP 
GTFAS CMHI LS EETCFQRWLTVERKFALQKMDS KLSSE AAWVSQ 
YKDITDVDEMKVPDCAETFMTLLLVITDRYKNLPTASRKLQFLE 
LQKDLVDDFRIRIiTQVMKEETRASLGFRYCTVILNAVNYISTVLA 
D WADNVFFLQLQQAALEVFAENNTLS KLQLGQLASMES S VFDDM 
INLIiERLKHDMIjTRQVDHVFREVKJDAAKLYKKERWLSLPSQSEQ 
AVMSLSSSACPLLLTLRBHIiLQLEQQLCFSLEKIFWQMLVEKLD 
VYIYQEIILANHFNEGGAAQLQFDMTRNLFPLFSHYCKRPENYF 
KH I KEACI VLNLN VGS ALTAGKDVLP VQLQGS FPAT 




207 


2523 


ESNSTMTTYIiEFIQQNEERDGVRFSWNVWPSSRLEATRMWPVA 
ALFTPLKERPDLPPIQYEPVLCSRTTCRAVLNPLCQVDYRAKLW 
ACNFCYQRNQFPPS YAGI S ELNQPAELLPQFSS IEY WLRGPQM 
PLI FLYWDTCMEDEDLQALKESMQMSLSIiLPPTALVGL I TFGR 
MVQVHELGCEG I SKS YVFRGTKDLSAKQLQEMLGLSKVPVTQAT 
RGPQVQQPPPSNRFLQPVQKIDMNLTDLLGBLQRDPWPVPQGKR 
PIJISSGVALSIAVGLLBCTFPNTGARIMMFIGGPATQGPGMVVG 
DEL K?P I RS WHD I DKDNAKYVKKGTKH FEALANRAATTGHVI D I 
YACALDQTGLLEMKCCPNLTGGYMVMGDSFNTSLFKQTFQRVFT 
KDMHGQFKMGFGGTLEIKTPR\EIKISGAIGPCVSLNSKGPCVS 
ENE IGTGGTCQWKI CGLSPTTTLAI YFEWNQHNAPI PQGG\RG 
A\IQFVTQY\QHSSGQRRIRVTTIARN\WADAQTQIQNIAASFD 
QEAAAI LMARIiAI YRAETEEGP0VLRWLDRQL IRLCQKFGE YHK 
DDPSSFRFSETFSLYPQFMFHLRRSSFLQVFNNSPDESSYYRHH 
FMRQDLTQSL1MIQPILYAYSFSGPPEPVLLDSSS I LADRILLM 
DTFFQIL I YHGET IAQWRKSG YQDMPE YENFRHIiLQAP VDDAQE 
ILHSRFPMPRY IDTEHGGSQARFLLS KVNPSQTHNNMYAWGQES 
GAPILTDDVSLQVFMDHLKKLAVSSAA 


6104 


124 




ev.voc i a XLioiu^JViJjr ttAijAnii VLiVVJii'WSAARGVIjRNYWERLLR 
BCLPQSRPGFPSPPWGPAIAVQ\AQPCLQSQQMIPVEVKRI/RSL 
LDSIFWMAAPKNRRTIEVNRCRRRNPQKLIKVKNNIDVCPECGH 
LKQKHVLCAYCYEKVCKETAE IRRQ IGKQEGGFFKAPTIET WL 
YTGETPSEQDQGKRI IERDRKRPSWFTQN 


6105 


3 


989 


piihgactslvlqrfchrrprpcaparpedmrrpaavplllllcf - 
gsqrakaatacgrprmlnrmvggqdtqegewpwqvs iqrngshf 
cggsliaeqwvltaahcfrntsbtslyqvllgarqlvqpgpham 
yarvrqvesnplyqgtassadvalveleapvpftnyilpvclpd 
psvifetgmncwvtgwgspseedllpeprilqklavpiidt\pr 
cnllys kdte fgyqpktikndmlcagfeegkkdackgds agplv 

CLVGQSWLQAGVISWGEGCARQNRPGVYIRVTAHHNWIHRIIPK








LQVQPSEVGRPBVTPPGPGAP 


6106 


3 


1302 


GRPPTAPHTGRPPTANRGDPRLDLKRGCARLLTSIESRGRPAAS 

AGLRRDRCAT^T? WPT.RRAPT JVR &*PPPP2\f* QDOD r'TV'DD T5D 7v nnnp 

W S RARH0 PGG LCLLLLLLCQFMEDRS AQAGN CWLRQAKNG RCQ V 
! LYKTELSKEECCSTGRLS TSWTEEDVNDNTLFKWM I FNGGAPNC 
1 PCKETCENVDCGPGKKCRMNKKNKPRCV CAPDCSN ITWKG PVC 
GLDGKTYRNECALLKARCKEQPELBVQYQGRCKKTCRDVFCPGS 
STC V\ VDQTNNAYCVTCNRI CPEPASSEQYLCGNDGVTYS \ SAC 
HLRKATCLLGRS I GLAYEGKCI KAKSCEDIQCTGGKKCLWDPKV 
GRGRCS LCDELCPDS KSDE P VCAS DNAT YAS ECAMKEAACS S GV 
LLE VXHSGSCNS I SEDTEEEEEDEDQDYS FP I SSI LEW 


6107 


623 


16~8 


SRCSS PRPEPGRGRGK/ LS PSEHRKWVEVFKACDEDHKGYLSRE 
DFKTAWMLFGYXPSKIEVDSVMSSINPNTSGILLEGFLNIVRK 
KKE AQR YRNB VRH I FTAFDTY YRG FLTLEDFKKAFRQVAP KLPE 
RTVLBVFREV\ DRDS\DGHVSF 


6108 


3 


1348 


GGSLRFSPPRVPS.CSRVFCPVPPGGCGLPS PMSASRPQS PTTPft 
CLPRRYMKHXRDDGPEKQEDEAVDVTPVMTCVFVVMCCSMLVLL 
YYFYDLLVYWIGIFCLASATGLYSCIiAPCVRRliP\SASAGESA i 
LLAPTIPNNSI*PYFHKRPQARML1*1ALFCVAVSVVWGVFRNEDQ 
WAWVLQDALGIAFCLYMLKTIRLPTFKACTLLLLVIjFLYDIFFV 
FITP FLTKSGS S IMVEVATGPSDSATREKLPMVLKVPRLNSSPt* 
ALCDRPFSLLGFGDILVPGLLVAVCHRFDIQVQSSRVYFVACTI 
AYGVGLLVT FVALALMQRGQ PALL YLVPCTLVTS CAVALWRREL 
GV FWTGSGFAKVL P PS PWAPAPADGPQ PPKDSATP LSPQP PS EE 
tr n i o f w f/uajs f Jti>XJ. a ncPIUAGAPMRSPGS PAESEGRDQAQPS 
PVTQPGASA 


6109 


1


1381


CRS RAG AASGGAI LEGTKLRRQRVDTNKPLDPLVPSALRAAMLY 
LEDYLEMI EQLPMDLRI>RFTEMRE^lDIlQVQNA^^DQLEQRVSBFF 
MNAKKNKP EWREEQMAS I KKD YYKALEDADEKVQIiANQI YDLVD 
RHLRKLDQELAKFKMELEADNAGITEILERRSLELDTPSQPVNN 
HHAHSHTPVEKRKYNPTSHHTTTDHIPEKKFKSEALLSTLTSDA 
SKENTLGCRNNNSTASSNNAYNVNSSQPLGSYNIGSLSSGTGAG 
GI \TMAAAQAVQATAQMKEGRRTSSLKAS YEAFKNNDFQLGKEF 
SMARETVG YS S S S ALMTTLTQNAS S S AADS RSGRKS KNNNKS S S 
QQSSSSSSSSSLSSGSSSSTWQEISQQTTWPESDSNSQVDWT 
YDPNBPRYCICNQVSYGEMVGCDTQDCPIEWFHYGCVGLTEAPK 
GKWYCPQCT\AAMKRRGSRHK 


6110 


77 


2464


ACPSAATMS DQDHSMDEMTAWKI EKGVGGNNGGNGNG3GAFS Q 
ARSSSTGSSSSTGGGGQESQPS PLAI.I AATCSR IBS PNENSNNS 
QGPSQSGGTGELDLTATQLSQGANGWQ 1 1 SSSSGATPTSKEQSG 
SS TNGSNGS ESS KNRTVSGGQ YWAAAPNLQNQQVLTGLPGVMP 
NIQYQVI PQFQTVDGQQLQFAATGAQVQQDGSGQIQI I PGANQQ 
I ITNRGSGGNI IAAMPNLLQQAVPLQGLANNVLSGQTQYVTNVP 
VALNGNI TLLPVNS VS AATLTPS SQAVTI SSSGS QBSGSQP VTS 
GTTI S SAS IiVS SQASSS S FFTNANSYSTTTTTSNMG IMNFTTSG 
SSGTNSQGQTPQRVSGLQGSDALNIQQNQTSGGSLQAGQQKEGE 
Q\NQGTQAAPKSI.SRPQbVQGG\QALQ\AFQAAPLSGQTFTTQA 
ISQETLQNLQLQAVPNSGPI I IRTPTVGPNGQVSWQTLQLQNLQ 
VQNPQAQT I TLAPMQGVS LGQT S S SNTTLT P I ASAAS I PAGTVT 
VNAAQLSSMPGLQTINLSALGTSG IQVHP IQGXiPLAIANAPGDH 
GAQLGLHGAGGDGIHDDTAGGEEGENSPDAQPQAGRRTRREACT 
CP YCKDS EGRGSGDPGKKKQHI CH I QGCGKVYGKTSHLRAHLRVJ 
HTGERPFMCTWSYCGKRFTRSDELQRHKRTHTGEKKFACPECPK 
RFMRSDHLSKKIKTHQNKKGGPGVALSVGTLPLDSGAGSEGSGT 
ATPSALITTNMVAMEAICPEGIARLANSGINVKEGGQFCSPINT 
SANGF


6111 
6112 


1637
77 


797 
196 


RVDPRVRGAMAPWGKRIAGVRGVLLDISGVLYDSGAGGGTAIAG 
• S VEAVARLKRSRLKVRFCTNESQKSRAELVGQLQRLGFDI SEQE 
VTAPAPAACQILKERGI»RPyLLIHDGV\ASEFDQIDTS /STPNC 
WI ADAGES FS YQNKNMAFQVLMELEKPVLI SLGW3RYY KBTSG 
LMIiD VGP YMKALE YAGGI KAEVGGKPS PE FFKS ALQAIGVE AHQ 
AVMI GDD I VGD VGGAQRCGMRALQ VRTGKFRPSDEHHPE VKADG 
YVDNLAEAVDLLLQHADK 


6113 


1779 


567 


MSSHKSFKSKRFLAKKOKPNRPILQWIWLKTGNKXRHNWK 
WBGRSWAAo^IiO^WGERSGVRASEAESPGkfeADVSWWSRQL 
ETMVDHLANTEINSQRIAAVESCFGASGQPLALPGRVLLGEGVL 
TKECRKKAKPRIFFLFNDILVYGSIVLNKRKYPQOHTtdt vmrr 

LELLPETLQAKNRWMIKTAKKSFWSAASATERQEWISHIEECV 
RRQLRATGRPA\STEHAaPWIPDKATDICMRCTOTRFc?AT.tpou 

HCRKCRVWCAECSRQRFIiLPRLSPKPVRVCSIiCYRELAAQQRK 
EEAEEQGAGVPRAASHLARP I CGRPVEMTMTPTRTRRAAGTATO 
PAAWSSTPRGWPGLPSTADPRPAEHLSPSQLHCPGPQEGSSRSC 
PGLRD P I PWKQ VQRWGVALSGLPVPFCWTIjCP YGFTAGNAFPFR 
KPQNTHRSW 


6114 


818 


246 


PT5RPRPSPGSPAMSWSACVSAAPSSSWPASSSWPCGPRRCCTR 
RRRCS PRCGLAAGSMCSCS PSWRCT PVPACWPSPPP \PAEQVQC 
GHLP PHADRRAIiRIjPVAAPARGPGPGHPAGPAGPRPARTP PAS ° 

HGPGRP TVpAPPCPLLAATE PTPSR PHQRWTREDRMLGRGSQVT 
GRPQWFLRGLVLFSL 


6115 


324 


71 


D VCGRVCAHPHLYTH I HMH I CAHAC \ I HTHAQLC/ I TASHALAH 
SHLYTCWl/MLTASHT PSHTHPHTAVHKEHRADVLRGTLTPLR 


6116 
6117 


595 


1430


TUVMPPGRWHAA/ ISSSGPVFEGARA\LQTVKKEEBDESYTPVQ 
AARPQTLNRPGQELFRQLFRQLRYHESSGPLETLSRLRELCRWW 
LRPDVLSKAQILELLVLEQFLSILPGELRVWVQLHWPESGEE\L 
WPCWRS CRGTLh5GHPGGTRALP\EPRCAliDGYRS\LRSAQl WS L 
ASPLRSSSALGDHLEPPYEIEARDFIiAGOSDTPAAOMDaT t7ddt? 

GCPGDQVTPTRSLTAQLQETMTFKDVEVTFSQDEWGWLDSAQRN 
LYRDVMLENYRNMASLGK 




1433 


222 


VGVPSPAPPCSWBVGPGGGWtPCjiUOjGO^RRTPI^LIiATRTR ' 
GLLSLFPPAAMHPAAFPLPVWAAVLWGAAPTRGLIRATSDHNA 
SMDFADLPALFGATLS QEGLQGFLVEAHPDNACS PI APPPPAPV 
NGSVFIALI^RFT)CNFDLKVI^AQKAGYGAAVVHNVKSNELLNM 
VWNSEBIQQQrWIPSVFIGERSSEYtiRALFVYEKGARVLLVPDN 
TFPLG YYL I P FTG I VGLLVLAMGAVMI ARC I QHRKRLQRNRLTK 

\EQI»KQI \ pthdyqkgdqyd vcai CLDB yedgdklrvlpcahay 

HSRCVDP WI*TQTRKTCPI CKQ PVHRGPGDEDQE EETQGQEEGDB 

GEPRDHPASERTPLLGSSPTLPTSFGSLAPAPLVFPGPSTDPPL 
SPPSSPVILV 


6118 
6119


1044 


247


STISCRACTSGATPGAQSHRSARGHAAGGKEtAALGMERGKVKK " 
KEKEKETQKEXI GE KGREEKVKRXEVEQKI KQEKQEKQERRKGK 
EKEEKRTKQGKETNKE KEQFKGQEEKGENKDS TLTRTPLE PLEK 
NKQILVLGLDGAGKTSVLHSLASNRVQHSVAPTQGFHAVCI>JTE 
DSQMEFLBIGGSKPFRSYWEMYLSN/ADSLARSFSVGFKQDSQP 
ITWKAKKYLHQLIAANPVLPLWFANKOJDLEAAYHITDIHEAIA 




1217 


462


UPRFVTENTTKAPAQBRTTQPRSSREGTLRSTMEYLSALNPSDL " 
LRSVSNISSEFGRRVWTSAPPPQRPFRVCDHKRTIRKGLTAATR 
2ELLAXATjKTLLLNGVLTLVIjEEIX5TAVDSBDFFQLLEDDTCLM 
VLQSGQSWSP7RSGVLSYGLGRERPKHSKDIARFTFDVYKQNPR 

dlfgsijivkatfygi.ysmscdfg^lxgpkkvimllrwtstllq 
3lghmllg i sstlrhavegaeqwqqkgrlhs y


6120


785 


179 


LE RAGGGQIiS SRALVGSGACLS LVARANGKGLPRGRKE FVEAVR 
VRYVAFR YRTPRAVCLRLWS CRREVI MS GRGKQGGKVRAKAKSR 
S S RAGLQ F P VGR VHRLLRKGNYAER VGAGAP VYLAAVLE Y LTAE 
I LE LAGNAARDN KKTR 1 I P RHLQLAI RNDEE LNKLLGKVT I AQG 
G \ VLPNIQAVLLPKKTESQKDEGANDP 


6121 


1612 


107 


F VRAQARGS RQP VRR P LLGAGS RLRCRS CGRME PL K V E KFATAN 
RGWGLRAVTPLRPGELLFRSDPLAYTVCKGSRGWCDRCUiGKE 
KLMRCSQCRVAKYCSAKCQKKAWPDHKRBCKCLKSCKPRyPPDS 
VRLLGR WFKLMDGAPS BSEKL YS F YDLESN INKLTEDKKEGLR 
QLVMTFQHFMREE IQDASQLPPAFDLFEAFAKVI CNS FTICNAE 
MQEVGVGL YPS ISLLNHSCDPNCS I VFNGPHLLLRAVRDIEVGE 
BLTICYLDMLMTSEBRRKQLRDQYCFECD\CFRCQTQDKDADML 
TGDEQVWKEVQESLKKIEEIiKAHWKWEQVLAMCJQAIISSNSERL 
PDINIYQLKVLDCAMDACINLGLLEEALFYGTRTMEPYRIFFPG 
S H P VRGVQ VM KVGKLQLHQGM F P QAMKNLR LA FD I MR VTHGREH 
SLI E DLILLLE/AMRRQHQS ILRERSQREI RRVSLLNALLRSHT 
LCFVSCVNLSYWKFCSVFV 


6122 


2 


2323 


kFRKMADGGAASQDESSAAAAAAADSRMNNPSETSKPSMESGDG 
NTGTQTNGLDFQKQPVPVGGAISTAQAQAFIiGHLHQVQIiAGTSL 
QAAAQSLNVQS KSNEESGDSQQPS Q PSQQPS VQAAI PQTQLMLA 
GGQITGLTLTPAQQQLLLQQAQAQAQLLAAAVQQHSASQQHSAA 
GAT I S AS AATPMTQ I PLS Q PI Q I AQDLQQLQQLQQQNLNLQQFV 
LVHPTTNLQPA\QFIISQTPQGQQGLLQA\QNLLTQLPRQSQAN 
LLQSQPRI \TLTSQPATPTCTIAATP IQTLPQSQSTPKRIDTPS 
LEEP\SDLEELEQFAKTFKQRRIKLGFT\QGDAGLAMVKLYGND 
FS PTTI FRFEALNLSFKNMCKLKPLLEKWLNDAENLSSDSS LSS 
PSALNSPGIEGLSRRRKKRTSIEA\MIRVALEKSFLEN\QKPTS 
EE1TMIADQLNMEKGVIRVWFCMRRQKEKRINPPSSGG\TSSSP 
I KA I F PS PTSLVATTPSLVTSS AATTLTVS PVL PLTS AAVTNLS 
VTGTSDTTSNNTATVISTAPPASSAVTSPSLSPSPSASASTSEA 
SSASETSTTQTTSTPLSSPLGTSQVMVTASGLQTA/AQLLPFKG 
AAQLPANASIAAMAAAAGLNPSLMAPSQFAAGGALLSLNPGTLS 
GALS PALMSNSTLATIQALASGGSLP I T S LDATGNL VFANAGGA 
PNIVTAPLFLNPQNLSLLTSNPVSLVSAAAASAGNSAPVASLHA 
TSTSAES IQNS L F TV AS AS GAAS TTTTAS KAQ 


6123 


3 


2944 


HLLHRWFGTDMQMINFTTGEFQLTEACPYLGTHSEESRFGILHL 
HLQPLEM KR VGWFTPAD YGKVTSLI L I RNNLTVIDMIGVEG FG 
ARELLKVGGRLPGAGGSLRFKVPESTLMDCRRQLKDSKQILS I T 
KNFKVENIGPLPITVSSLKINGYNCQGYGFEVftDCHQFSLDPNT 
SRDIS I VFTPDFTSS WVIRDLS LVTAADLEFRFTLNVTLPHHLL 
PLCADWPG PSWEESFWRLTVF FVSLSLLGVIIi IAFQQAQ Y 1 1M 
EFMKTRQRQNAS S S S QQNNGPMDVTS PH5 YKSNCKNFLDT YGPS 
DKGRGKNCLPVNTPQSRIQNAAKRSPATYGHSQKKHKCSVYYSK 
HKTSTAAASSTSTTTEEKQTSPLGSSLPAAKEDICTDAMRBNWI 

Q T .T? V21 ^ fZ T W17MT -f\ V~F\TT TT T3VMT T KI^C , T?KT»Pf m^TTTTrDOlTn^nnMn 

o utf. X Ma \a x e* v W UJrJH u I L e KTJ Li JjN K c. iu NT LKNTIVFSNPSSECS 
MKEGIQTCMFPKETDI KTS ENTAE FXERELC PLKTS KKLPE NHL 
PRNSPQYHQPDLPEISRKNNGNNQQVPVKNEVDHCENLKKVDTK 
PSSEKKIHKTSREDMFSEKQDIPFVEQEDPYRKKKLQEKREGNL 
QNLNWS KS RTCRjuVKKRGVAPVSRPPEQSDLKLVCSDFERSELS 
SDINVRSWCIQESTREVCKADAEIASSLPAAQREAEGYYQKPEK 
KCVDKFCSDSSSDCGSSSGSVRASRGSWGSWSSTSSSDGDKKPM 
VDAQHFLPAGDS VSQNDFPSEAPI SLNLS HIT I CNPMTGNS LPQ Y 
AEPSCPSLPAGPTGVEEDKGLYSPGDLWPTPPVCVTSSLNCTLE 
NGVPCVIQESAPVHNSFIDWSATCEGQFSSAYCPLBLMDYNAFP 
EENMNYANGFPCPADVQTDFIDHNSQSTWNTPP\NMPAS\WGNA 
QFPSS SRP YLKSTPKACLPMSGLFGP I \HAP \QSDVYENCCP IN
PTTEHSD/THMENQA\WCKB^PC5F\NPF^YMNLDIWTTT\A^
NRNANFPLSRDSS YCGNV 


6124 


1573 


236 


S DEALRLAGERGMGRVQL FBI SLSHGR WYS PGE PLAGT VRVRL 
GAPLPFRAIRVTCIGSCGVSNKANDTAWWEEGYFNSSLSLADK 
G S LPAGEHSFPFQ FLLPATAPTS FBG P FGKI VHQ VRAAIHTPR F 
S KDHKCS LVFYI LSPLNLNS I PDIBQ PNVAS ATKKFS YKLVKTG 
SVVLTASTDLJ^GYVVGQALQLHADVENQSGKDTSPVVASLLQKV 
S YKAKRW I HDVRTIAEVEGAG VJCAWRRAQWHEQILVPALPQSAL 
f\j t-i jj 1 ti x D YYLQ v 5LKAPEAT VTLP VFI GNI AV/NPCPSEP PA 
R PGAAS WG PTPGG \ PSAP PQEEAEAE AAAGG PH FLD P VFLST KS 
HSQRQPLLATLSSVPGAPEPCPQDGSPASHPLHPPLCISTGATV 
PYFAEGSGGPVPTTSTLILPPEYSSWGYPYEAPPSYEQSCGGVE 
PSLTPES 


6125 


1 


904


KTCP KLTCAFTVS VP DSCCRVCRGDGELSWEHSDGDI FRQPANR 
EARHSYHRSHYDPPPSRQAGGLSRFPGARSHRGALMDSQQASGT 
x v ux v ijnw KHKHGQ VCVS NGKT YSHGBS WHPNLRAFG I VECVLC 
TCNVTKQECKKIHCPNRYPCKYPQKIDGKCCKVCPG/KKAKEEL 
PGQSFDNKGYFCX3EETMPVYESVFMEDGETTRKIALETERPPQV 
EVHVWTIRKGILQHFHIEKISKRMFEELPHFKLVTRTTLSQWKI 
FTEGEAQISQMCSSRVCRTELEDLVKVLYLERSEKGHC 


6126 


1124


389 


rl.lseapcprsrrrfqmnpewgoafvhvavagglcavavf¥gif 
dsvsvqvgyehyaeapvaglpaflampknslvnmaytllglswl 
hrggamglgprylkdvfaamallygpvqwlrlwtqwrraavldq 
wltlpifawpvawclyldrgwrp\wlfi^lecvslasyglauji 
pqg fevalg ahwpavgqalrt \ hrh yg /s at psat ylalgvls 

ci^fvvlklcdhqiarwrlfqcltghfwskvcdvlqfhfaflfl 
thfnthprfhpsggktr 


6127


1335 


463 


vlprrclvfwntmdssreptlgrldaagfwqvwqrfdadekgy " 
ieekeldafflhmlmklgtddtvmkanlhkvkqqfmttqdaskd 

GRIRMKEI^MFLSEDENFZJjLFRRENPLDSSVEFMQIWRKYDA 
DSSGFISAAELRNFUeDLFLHHKKAXSEAKLEEYTGTMMKIFDR 
NKDGRLDLNDLARILAIiQENFLLQFKMDACSTBKRKGDFEKIFA 
YYDVS KTG ALEGP \ EVDGF VKDMMELVQP9 1 SG V0LDKFRE I LL 
RHCDVNKDGKIQKSBLALCLGLKINP 


6128 


2511 


843 


TGRMSRRQLBRW VWS SQQVQARGRNVRAPRIjGKI AMGIiBMS SKD 
cjj^ajjLRaKAWKuAQKPQSAWCGGRKTRVYATSSRRAPPSEGT^ 
GGAARPEKTAEEGPPAAPGSLRHSGPLGPHACPTALPEPQVTSA 
MSSQ WG I EPL Y I KAEPAS PDS P KGS S ETETE P PVAtAPG \ PAP 
TRCLPGHKEEEDGEGAGPG EQGGGKLVLSSLPKRLCLVCGDVAS 
GYHYG VAS CEACKA FFKRT I QGS IE YS CPASNECE ITKRRRKAC 
QACRFTKCLRVGMLKEGVRLDRVRGGRQKYKRRPEVDPLPFPG? 
FPAGPLAVAGGPRKTAAPVNALVSHLLWEPEKLYAMPDPAGPD 
GHLPAVATLCDLFDREIWTISWAKSIPGFSSLSLSDQMSVLQS 
VWMEVLVI/5 VAQRSLTLQDELAFAB YLVLDEEGARPAGLGELG \ 
AALLQLVRRLQALRLEREKYVIiLKAIAIANSDSVHI EDEPRLWS 
SCBKLLHEALLEYEAGRAGPGGGAERRRAGRLLLTLPLLRQTAG 
KVLAHFYGVKLEGKVPMHKLFLEMLEAMMD 


6129 
6130 


1764 
3 


771 
577 


ARFARSAHEGKMPKKKTGARKKAENRREREKQLRASRSTIDIaAK 
HPCNASMECDKCQRRQKNRAF C YFCNSVQKL PI CAQCXJKTKCMM 
KSSDCVI KHAGVYSTGLAMVGAI CDFCEAWVCHGRKCLSTHACA 
CPLTDAEC\VECERGVWDHGGRIFSCSFCHNFLCEDDQFEHQAS 
CQVLEAETFKCVSCNRLGQHS CLRCKACFCDDHTRSKVFKQEKG 
KQP PCPKCGHETQETKDLSMSTRSLKFGRQTGGEEGDGA5GYDA 
YWKNLSSDKYGDTSYHDEEEDEYEAEDDEEEEDEGRKDSDTESS 
DL FTNLNLGRTYASG YAHYEEQEN 

GRGGTMREYKVVVLGSG\GVGKSALTV\QFVTCTFIEKYDPTIE
DF YRKE I E V\DSS PS VAGI S WTQQGTEQF\ASMRDL Y I KKGQGC
ILVYSLVNQQSFQ\DIKPMRDQIIRVKVSEKVPVI\LVGN\SVD 
LES ERE VS S S EGRALAEE WGC P FMETSAKSKTM VDELFAE I VRQ 
MNYAAQPDKDDPCCSACNIQ 


6131 


3 


1811 


SSPREKTSDSSHRPSRHGFLFLRLVGLSPFSYLCVPPSRPVPGS 

PRSLSAMRLLPLAPGRIiRRGSPRHLPSCSPALLLLVLGGCLGVF 
GVAAGTRRPNWLLLTDDODRVTinnMTPT.WTif a t rr-VMnntn-oo 

S AY VPS ALCC PSRAS I LTG KYPHNHHVVNNTLEGNCS S KSWQKI 
QE PNTPPAI LRSMCGYQTFP\AGKYLNE YGAPDAGGLEHVPLGW 
S YW YALBKNS KYYNYTIiS ING KAR KHGEN YSVDYLTDVLANVS L 
DFLD YKS NFS P FFMMTATP \APHS PWTAAPQ YQKAFQNVFAPRN 
KNFNIHGTWIOIWLIRQAKTPMTMSSIQFIiDNAFRKRWQTLLSVD 
DLVEKLVKRLEFTGELNNTYI F YTS DNGYHTGQFSLP I DKRQLY 
EFD I KVPLLVRGPGI KPNQTS KMLVANI DLG PTI LD1 AGYDLNK 
TQMDGMS LLP I LRGASNLTWRS DVLVE YQGEGRNVTDPTC PSLS 
PGVSQCFPDCVCEDAYNNTVAfVP TMQnT.MMT nvnjcnrvMn rmt 

BVYNLTADPDQITWIAKTIDPELLGKMN YRLMMLQS CSG PTCRT 
PGVFDPGYRFDPRLMFSNRGSVRTRRFSKHLL 


6132 




1241 


AAGLLPPGLVPEDPRRTRNLLPFG I QGPP FALS R PLFS C VESG W 
AWEAMBPEFLYDLLQLPKGVEPPAEEELSKGGKKKYLPPTSRKD 
PKFEELQKPA\VLMEW INATLLPEHIWRSLEEDMFDGL I LHHL 
FQRLAALKLEAEDIALTATSQKHKLTVVLEAVNRS\CSWRSGRP 
SGA/WES I FNKDLLSTLHLLVALAKRFQPDLSLPTNVQVEVITI 
ESTKSGLKSEKLVEQLTEYSTDKDEPPKDVFDELFKLAPBKVNA 
VKEAIVNFVNQKLDRLGLSVQNLDTQFADGVILLLLIGQLEGFF 
LHLKEFYLTPNSPAEMLHNVTLALELL/IGRGPAQLPC/LALK/ 
TI VNKDAKSTLRVL YGL FCKHTQKAHRDRTPHGAPN 


6133 


2 


42** 


FVHGSMADTDLFMECEEEELEPWQKISDVIEDSVVEDYNSVDKT 

TTVS VS QQPVSAP VPIAAHAS VAGHLS TSTTVS S SGAQNSDST K 

KTLVTLIANNNAGNPLVQQGGQPLILTQNPAPGLGTMVTQPVLR 

PVQVMQNANHVTSSPVASQPIFITTQGFPVRNVRPVQNAMNQVG 

IVLNVQQGQTVRPITLVPAPGTQFVKPTVGVPQVFSQMTPVRPG 

STMPVRPTTNTF1TVIPATLTIRSTVPQSQSQQTKSTPSTSTTP 

TATQPTSLGQLAVQSPGQSNQTTNPKLAPSFPS PPAVSIAS FVT 

VKRPGVTGENSNEVAKLVNTLNTI PS LG QSPGP VWSNNSSAH\ 

GSQRTSGPESSMKVTSS1 PVFDLQDGGRKI CPRCNAQFRVTEAL 

RGHMCYCCPEMVEYQKKGKSLDSEPSVPSAAKPPS PEKTAPVAS 

/THPSSTPIPALSPPY/TKVPEPNENVGDAVQTKLIMLVDDFYY 

GRDGGKVAQLTNFPKVATSFRCPHCTKRLKNNIRFMNHKKHHVE 

LDQQNGEVDGHTICQHCYRQFSTPFQLQCHLENVHS PYESTTKC 

KICEWAFESEPLFLQHMKDTHKPGE^5PYVCGVCQYRSSLYSEVD 

VHFRMIHEDTRHLLCPYCLKVFKNGNAFQQHYMRHQKR\NVYH\ 

CNKCRVQFLFAKDKIEHKLQHHKTFRKPKQLEGLKPGTKVTIRA 

SRGQPRTVPVSSNDTPPSALQEAAPLTSSMDPLPVFLYPPVQRS 

I QKRAVRKMS VMGRQTCLECS FE I PDFPWHFPrYVHCSLCR Y$T 

CCSRAYANHMINNHVPRKSPKYLALFKNSVSGIKLACTSCTFVT 

S VGDAMA KHLVFNP SHRS S S I L PRGLTWI AHSRHGQTRDRVHDR 

NVKNMYP P PS FPTNKAATVKSAGATPABPEELLTPLAPALPSPA 

STATPPPTPTHPQALALPPLATEGAECLNVDDQDEGSPVTQEPE 

LASGGGGSGGVGKKEQLSVKKIiRWLFALCCNTEQAAEiJFRNPO 

RRIRRWLRRFQASQGENLEGKYLSFEAEEKLAEWVLTQREQQLP 

VNEETLFQKATKIGRSLEGGFKISYEWAVRFMLRHHLTPHARRA 

VAHTLPKDVAENAGLFIDFVQRQIHNQDLPLSMIVAIDEISLFL 

DTBVLSSDDRKENALQTVGTGB PWCDWLAI LADGTVLPTLVF Y 

RGQMDQPANMPDSILLEAKESGYSDDEIMELWSTRVWQKHTACQ 

RSKGMLVMDCHRTHLSEEVLAMLSASSTLPAWPAGCSSKIQPL
DVCI KRTVKNFLHKKWKEQAREMADTACDSDVLLQLVLVWLGE V
LGVI GDCPELVQRS FLVAS VLPG P DGNINS PTR N ADMQE E L I AS 
LEEQLKLSGEHSESSTPRPRSSPEETIEPESLHQLFEGESETES 
FYGFEEADLDLMEI 


6134


2 


4256 


FVHGSMADTDLFMECEEEELEPWQKISDVIEDSWEDYNSVDKT 
TTVSVSQQPVSAPVPIAAJlASVAGHLSTSTTVSSSGAjQNSDSTK 
KTLVTLI ANNNAGNPLVQQGGQPL 1 LTQNPAPGLGTMVTQ PVLR 
P VQVMQNANHVTSS P VASQPI F I TTQGFPVRNVRPVQNAMNQVG 
I VLNVQQGQTVRPI TLVPAPGTQFVKPTVGVPQVFSQMTP VRPG 
STMPVRPTTNTFTTVIPATLTIRSTVPQSQSQQTKSTPSTSTTP 
TATQPTSLGQLAVQSPGQSNQTTNPKLAPSFPS PPAVS I ASFVT 
VKR PGVTGENSNE VAKL VNTLNTI PS LGQSPGP VWSNNSSAH\ 
GSQRTSGPBSSMKVTSS IPVFDLQDGGRKICPRCNAQFRVTEAL 
RGHMCYCCPEMVEYQKKGKSLDSBPSVPSAAKPPSPEKTAPVAS 
/THPS STPI PALSPPY/TKVPEPNENVGDAVQTKLIMLVDDFY Y 
GRDGGKVAQLTNFPKVATS FRCPHCTKRLKNN IRFMNHMXHHVE 
LDQQNGEVDGHTI CQHCYRQ FSTPFQLQCHLBNVHSPY ES TTKC 
KICEWAFESEPLFLQHMKDTHKPGEMPYVOQVCQYRSSLYSEVD 
VHFRMIHEDTRHLLCPYCLiCVFKNGNAFQQHYMRHQKR\NVYH\ 
CNKCRVQFLFAKDKIEHKLQHHKTFRKPKQLEG1.KPGTKVTIRA 
SRGQPRTVPVS SNDTPPSALQEAAPLTSSMDPLPVFLYPPVQRS 
IQKRAVRKMS VMGRQTCLE CS FE ZPD FPNHFPT YVHCS LCRYST 
CCS RAYANHM INNHVPR KS PKYLAIiFKNSVSG I KLACTS CTF VT 
S VGDAMAKHL VFNPSHRS S S I LPRGLT W I AHS RHGQTRDR VHDR 
NVKNMYP PPS FPTNKAATVKSAGATPAE PEELLTPLAPALP S PA 
STATPPPTPTHPQALALPPLATEGAECLNVDDQDEGSPVTQBPE 
LASGGGGSGGVGKKEQLS VKKLRVVLFALCCNTEQAAEH FRNPQ 
RRIRRWLRRFQASQGENLEGKYLSFEAEEKLAEWVLTQREQQliP 
VNEETLFQKATK IGRSLEGGFKIS YEWAVRFMLRHBLTPHARRA 
VAHTLPKDVAENAGLFlDFVQRQIHNQDLPLSMrVAIDEISLFL 
DTEVLS SDDRJKENALQTVGTGE P WCDVVLAI LADGTVLPTLV FY 
RGQMDQPANMPDSILLEAKESGYSDDEIMELWSTRVWQKHTACQ 
RSKGMLVMDCHRTHLSEEVLAMLSASSTLPAWPAGCSSKIQPIi 
DVCIKRTVKNFLHKKWKEQAREMADTACDSDVLLiQLVLVWLGEV 
LGVIGDCPELVQRSFLVASVLPGPDGNINSPTRNADMQEELIAS 
LEEQLKLSGEHSESSTPRPRSSPEETIEPESLHQLFEGESETES 
FYGFEEADLDLME I 


6135 


2 


4256 


FVHGSMADTDIiFMECEEEELEPWQKISDVIEDSWEDYNSVDKT 
TTVSVSQQP VSAPVP IAAHAS VAGHLSTSTTVS S SGAQNSDSTK 
KTLVTLI ANNNAGNPLVQQGGQ PL I LTQN PAPGLGTMVTQP VLR 
P VQVMQNANHVTS S P VASQP I FI TTQGFPVRNVRPVQNAMNQVG 
IVLNVQQGQTVRPITLVPAPGTQFVKPTVGVPQVFSQMTPVRPG 
STMPVRPTTNTFTTVIPATLTIRSTVPQSQSQQTKSTPSTSTTP 
TATQPTS LGQLAVQS PGQSNQTTNPKLAPS FPS PPAVS I ASFVT 
VKRPGVTGENSNE VAKLVNTLNT I PSLG QS PG P WVSNNS SAH\ 
GSQRTS G PESSMKVTS S I PVFDLQDGGRK1CPRCNAQFRVTEAL 
RGHMCYCCPEMVEYQKKGKSLDSEPSVPSAAKPPSPEKTAPVAS 
/THPSS T P I PALS PP Y/ TKVPE PHENVGDAVQTKLIMLVDDF Y Y 
GRDGGKVAQLTNFPKVATS FRCPHCTKRLKNN I RFMNHMKHHVE 
LDQQNGBVDGHTICQHCYRQFSTPFQLQCHLENVHSPYESTTKC 
KI CEWAFE SEPLFLQHMKDTHKPGEMPYVCQVCQ YRSS LYSEVD 
VHFRM IHE DTRHLLCP YCLKVFKNGNAFQQH YMRHQKR\NVYH\ 
CNKCRVQFLFAKDKIEHKLQHHKTFRKPKQLEGLKPGTKVTIRA 
SRGQPRTVPVS SNDTPPSALQEAAPLTSSMDPLPVFLYPPVQRS 
I QKRAVRKMS VMGRQTCLECS FE I PDFPNHF PTYVHCSLCRYST 
CCSRAYANHMINNHVPRKSPKYLALFKNSVSGI KLACTS CTFVT 






WO 01/53312 



PCT/US00/34263



SEQ ID NO:

Predicted beginning nucleotide location corresponding to first
amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first
amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon,
/=possible nucleotide deletion, \=possible nucleotide insertion)








SVGDAMAKHLVFNPSHRSSSII.PRGLTWIAHSRHGQTRDRVHDR 
NVKNM YP PPS FPTNKAATVKS AGATPAE P E E LLT P LAP ALP Q PA 
STATP PPTPTHPQALALP PLATEGAECLNVDDQDEGS P VTQE PE 
IiASGGGGSGGVGKKEQLSVKKLRWLFALCCNTEQAAEHPRNPQ 
RRIRRWLRRFQASQGBNLEGKYLS FEAEEELAEWVLTQREQQIjP 
VNEETLFQKATKIGRSLBGGFKISYEWAVRFMLRHHLTPHARRA 
YAHTL PXD VAENAGL FX DFVQRQI HNQDLPLS MTVA1 DE I SLFL 
DTEVLS S DDRKENALQ WGTGEPWCDVVLAI LADGTVLPTLVFY 
RGQMDQPAjmFDSILLEAKESGYSDDEIMELWSTRVWQKHTACQ 
R3KGMLVMDCHRTHI»SEEVLAMLSASSTLPAWPAGCSSK1QPL 
DVCI KRTVKNFLHKKWKEO^REMADTACDSDVLLQLVLVWLGEV 
LGVI GDC PEL VQRS FLVASVLPGPDGNINS PTRNADMQEEL IAS 
LEEQLKLSGEHSESSTPRPRSSPEETIEPESLHQLFBGBSETBS 
FYGFEEADLDLMEI 


6136 


1704 


539 


FGVRMALEGMSKRKRKRSVQEGENPDDGVRGSPPEDYRLGQVAS 
SLFRGEHHSRGGTCRLASLFSSLEPQIQPVYVPVPK\ ESALASA 
DLEEEIHQKQGQKRKNSQPGVKVADRKILDDTEDTWSQRXKIQ 
INQEEERLKNERTVFVGNLPVTCNKKKLKSFFKEYGQIESVRFR 
SLIPAEGTLSKKLAAIKRKIHPDQKNINAYWFKEESAATQALK 
RNGAQI ADG FRI RVDLASETS SRDKRS VFVGNLP YKVBE SAIEK 
HFLDCGSIMAVRIVRDKMTGIGKGFGYVLPBNTDSVHLALKLNK 
SELMGRKLRVMRSVNKEKFKG^NSNPRLKIWSKPKOX3LNFTSKT 
AEGHPKSLFIGEKAVLLKTKKKGQKKSGRPKKQRKQK 


6137


141 


2656 


RALRKRROGPGRRGALGSGPGPQRRPGRVPBERPAPPRERICHPG 
MWNMLIVAMCLA\LIX3LPGKAQELQGHVS\IILAGEQLGDLAKK 
YLWQG \LFQLYLDEAGRGHS PS FHG AALTAPKQGQELMAKALES 
LSCP KDMAPSHCAEHKDQ FLQLSQ YRQLKTAED YQALNKD I EAQ 
LQHAGLREAGG I FYFS VP PFAYEDIARNINSSCRPGPGAWLRW 
LEKPFGHDHFSAQQLATBLGTFFQEEEMYRVDHYLGKQAVAQIL 
PFRDQNRKALDGLWNRHHVERVEI IMKETVDAEGRTSFYEE YGV 
I RDVLQNHLTEVLTIiVAMELPHNVSSAEAVLRHKLQVFQALRGL 
QRGS AWGQYQS YSEQVRRELQKPDS FHSLTPTFAGVLVH I DNL 
RWEG VPFI LKSG KALDER VG YARILFKNQACC VQS EKHWAAAQS 
QCIiPRQLVFHIGHGDLGS PAVL VSRNLFRPSLPS S WKEMEG PPG 
LRLFGSPLSDYYAYSPVRERJDAHSVLLSHIFHGRKNFFITTENL 
LASWNFWTP ItLE SLAHKAP RL YPGGAENGRLLDFEFSS GRLFFS 
QQQ P BQLVPGP G PGPMPS D FQ VLRAKYR ES SL VS AWS E EL I S KL 
ANDIBATAVRAVR R FGQFHLALSGG S S PVALFQQLATAHYG FPW 
AHTHLWLVDERCVPL3 DP ES NFQGLQAHLLQHVR I P YYNIH \AM 
PVHLQQRLCAEEDQGAHI YARE ISALGANSS FDLVLLGMGADGH 
TASLFPQSPTGLDGEQLWLTTSPSQPHRRMSLSLPLINRAKKV 
AVLVKGRM KRE I TTLVSR VGHE PKKWP ISGVLPHSGQLVWYMDY 
DAFLG 


6138 


4587 


934 


EFSkLTDRWQNAVQGVRQRKGDVDGLVRQWQDFTTSVENLF^FL" 
TDTSHLLSAVKGQERFSLYQTRSLIHELKMKEIHFQRRRTTCAL 
TLEAGEKLLLTTDLKTKES VGRRI SQLQDSWKDMEPQLAEMI KQ 
FQSTVETWDQ^EJCKIKELKSRLQVLKAQSEDPLPELHEDLHNEK 
ELI KELEQSLASWTQNLKELQTM KADLTRHVLVEDVMVLKEQ I E 
HLHRQWEDLCLRVAI R KQE I EDRLNSVJ VVFNEKNKELCAWLVQM 
ENKVLQTADI S IEEMI EKLQKDCMEEINLFSENKLQLKQMGDQL 
IKASNKSPJtAEIDDKLNKINDRWQHLFDVXGSRVKKLKETFAFI 
O^LDKNMSNLRTWLARIESELSKPVVYDVCDDQEIQKRLAEQQD 
LQRDIEQHSAGVES VFNICDVLLHDSDACANETECDS IQQTTRS 
LD RRWRN I CAMS M ERR MK I EETWRLWQKFLDDYS RFE DW L KS AB 
RTAACPNS SEVLYTS AKEELKR FEAFQRQIHBRLTQLEL INKQY 
RRLARENRTDTASRL KQMVHEGNQRWDNLQRRVTAVLRRLRHFT 








NQREEFEGTRESILVWLTBMDLQLTNVEHFSSSDADDKMRQLNG 
PQQEITLNTNKlDQLIVFGBQLIQKSEP\LDAVLIEDEtiEELHR 
YCQEVFGRVSRFHRRLTSCTPGLEDEKEASENETDMEDPREIQT 
DSWRKRGESEEPSS PQSLCHLVAPGHERSGCETPVS VDS \ I PLE 
WDHTGRRGGPSSSH\EEDBEAQYY\ SALSGKS ISDGHSWHVPDS 
PSCPBHHYKQMEGDRNVPPVPPASSTPYKPPYGKLLLPPGTDGG 
KEG PRVLNGNP QQEDGGLAG I TEQQS GA FDRWEM I QAQ E L \ HNK 
LXI KQNLQQLNSDI SAITTWLKKTEABLEMLKMAKPPSDIQEIE 
LRVKRLQEILKAFDTYKALVVSVNVSSKEFLQTESPESTELQSR 
LRQLSLLWEAAQGAVDSWRGGLRQSLMQCQDFHQLSQNLLLWLA 
SAXNRRQKAHVTDPKADPRALLECRRELMQLEKELVERQPQVDM 
LQBISNSLLIKGHGEDCIEAEEKVHVl\EKKIiKQLREQVSQDLM 
ALQGTQN P AS PLPS FDEVDSGDQPPATS VPAPRAKQFRAVRTTE 
GEEETESRVPGSTRPQRSPLSRWRAALPLQLLLLLLLIiLACLL 
P S SEEDYS CTQANNF \ARS F YPMLR YTNGPPPT 


6139 


52 


1131 


LGD WVWSRTCCVLETPTS VLRRARARGPCPTDS KWALPRLREGE 
TERRPWEAS S WKTL /LAG W IGG AAS VI VGHP LDTVKTRLQAG VG 
YGNTLS CIRWYRRESMFGFFKGMS FPLASIAVYNS WFGVFSN 
TQRFLSQHROGEPEAS PPRTLSDLLLASMVAGWSVGLGGPVDL 
IKIRLQMQTPPVSGRQPRFEVQGSGSCG\BPAYC2GPVHCITTIV 
RNEGLAGLYRGASAMLLRDVPGYCIiYFIPYVFliSEWITPEACTG 
P S P CAVWLAGGMAG A I SWGTATPMDWKSRLQADGVYLNKYKGV 
LDCI SQS YQKEGLKVFFRGI TVNAVRGFPMSAAMFLGYEIiSIiQA 
IRGDHAVTSP 


6140 


694 


136 


RPELELWRLRSRSWRPLGVPRRCHRRNWKBPVRA^PLSVTVWAP 
RCQRP/QPPAPEPSSPNAAVPEAI PTPRAAASAALELPLGPAPV 
SVAPQAEAEARSTPGPAGSRLGPET FRQRFRQFR YQDAAGPREA 
FRQLREL/SPRQWLRPDI \RTKEQ\ IVEMLVQEQLIiAILPEAAR 
ARRIRRRTDVRITG 


6141 


2 


984 


AQVGPRSRPCKMPIjKIiRGKKKAKSKETAGLVEGEPTGAGGGSLS 
ASRAPARRLVFHAQLAHGSATGRVEGFSSIQEIiYAQIAGAFEIS 
PSEILYCTLNTPKIDMERIiLGGQLGLEDFIFAHVKGIEKEVNVY 
KSEDSLGLTITDNGVGYAFIKRIKDGGVIDSVKTICVGDHIESI
NGENIVGWPJIYDVAKKLKELKKEELFTMKLIEPKKAFEIELRSK 
AGKS 3GEKI GCGRATLRLR S KGPATVEEMPSETKAK\ AIEK I DD 
VLEL YMG I RDI DLATTMFEAGKDKVN PDE FAVALDETLGDFAFP 
DEFVFD VWGVI GDAKRRGL 


6142 


116 


602 


EAEGEQVCGAKCCGDAPHVENREEETARIGPGVMESKEERALNN 
LIVENVNQENDEKDEKEQVANKGEPLALPLNVSEYCVPRGNRRR 
FRVRQP I LQYRWD IMHRLGE PQARMREENMERIGEE VRQLME KL 
REKQLSHSIiRAVSTDPPHHDHHDEFC\LMP 


6143 


2802 


216 


FRMRIFLHCPWNQQMWKIWNLLETSIiESCKAHLSIQKLIiKERXQ - 
\QLPVTKHRDS I VETLKRHR VVVVAGET\GSGKSTQVPHFLLED 
LIiLNE WEASKCNI VCTQPRRI SAVSLANRVCDELGCENGPGGRN 
odc>vj luiKinriOiuiLAd jljkjujux l.1 1\jV JjJjKKJLUJiiajJjIj&nvs / HM 
FI VDE V\HER\S VQSDFLL I I LKE I LQ KRSDLHL I LM S ATVDS E 
iCFSTYFTHCPILRISGRSYPVEVFHIiEDIIEETGFVLEKDSEYC 
QKFliEEEEEVTIIOTSKAGGIKKYQEYIPVQTGAHADLNPFYQK 
YS5RTQHAI LYMNPHKINLDLI LELLAYLDKSPQFRNI EGAVLI 
FLPGLAHIQQLYDLLSNDRRFYSERYKVIALHSILSTQDQAAAF 
TLPP PGVRKIVLATN IAETG I TI PDWFVI DTGRTKENKYHES S 
QMSSLVETFVSKASALQRQGRAGRVRDGFCFRMYTRERFEGFMD 
YSVPE I LRVPLEELCLHIMKCNLGSPEDFLSKALDP PQLQVI SN 
AMNLLRKIGACEl^EPKLTPLGQHLAALPVNVKIGKMLIFXSAIF 
GCLDPYATLAAVMTEKSPFTT P IGRKDEADLAKSALAMADSDHI* 
T I YNAYLGWKKARQEGGYRS E ITYCRRNFLNRTSLLTLEDVKQE 








LIKLVKAAGFSSSTTSTSWEGNRASQTLSFQEIALLKAVLVAGL 
YDNVGKI I YTKS VDVTEKLAC I VETAQGKAQVH PS S VNRDLQTH 
GWLLYQE KI RYARVYLRETTL ITPFPVLLFGGDIEVQHRERLLS 
IIXSWIYFQAPWIAVIFKQIjRVLIDSVLRKKLENPKMSLENDKI 
LQIITELIKTENN 


6144 


1289 


568 


SGPGSMSGQRVDVKWMLGKEYVGKTSLVERYVHDRFLVGPYQN 
VSASGGARHGGRGSGGPVXCT YGPDLFPLVA\ TIGAAFVAKVMS 
VGDRTVTLGI W DTAGS ER YEAMS R I YYRGAKAAI VC YDLTDS S S 
FERAKFWVKELRSLEEGCQIYLCGTKSDLLEEDRRRRRVDFHDV 
QD YADN I KAQLFE TS S KTGQS VD ELFQKVAED YVS VAAFQVMTE 
DKGVDLGQKPNPYFYSCCHH 


6145 


1109 


196 


GGMDLS EL E RDNTGRCRLSS P VPAVCRKE P CVLG VDEAGRG PVL 
GPMVYAICYCPLPRLADIjEALKVADSKTLLESERERLFAKMEDT 
DFVGWALDVLS PNL ISIS MLGRVKYNLNSLSHDTATGLI Q YALD 
QG VNVTQVFVDTVGMPET YQARLQQS FPG I EVTVKAKADALYP V 
\VS AAS I CAKVARDQAVKKWQFVEKLQDLDTD YG\SG YPNDPQD 
/TKAWLKEHVEPVF\GFP\QFVRF\SWRTAQTI\LEKEAEDVIR 
EDSASENQEGLRKITSYFLNEGSQARPRSSHRYFLERGIiESTTS 
L 


6146 


428 


781 


LKKKGKEKAEAQQVEALPGPSLDQWHRSAGEEEDGPVLTDEQKS 

R/YPGHEAHDG£G\WI^QSIIRKVVDPETGRT^ 

BI VTKERHRE INKQATRGDCLAFQMRAGijLP 


6147 


1 


2304 


GTRQLPPPSPGSGPGDSPEGPEGEAPERRRKAHGMLKLYYGLSE 
GEAAGRPAGPDPLDPTDLNGAHFDPEVYLDKLRRECPLAQLMDS 
ETDM\HIQIRALDSDMQTLVYENYNKFI SATDTIR104KNDFRKME 
DEMDRLATNMAV ITDFS AR I SATLQDRHERI TKLAGVHALLRKL 
QFLFELPSRLTKCVELGAYGQAVRYQGRAOAVLQOYQHLPSFRA 
IQDDCQVITARLAQQLRQRFREGGSGAPEQAECVELLLALGEPA 
EELCEE FLAHARGRLE KELRNLEAELG PS PPAPDVLEFTDHG\ S 
SG F VGGLCQ VAAAYQELFAAQGPAGAEKLAAFARQLGS RY FALV 
ERRLAQEQGGGDNSLLVRALDRFHRRLRAPGALLAAAGLADAAT 
E I VER VARERLGHHLQG LRAAFLGCLTD VRQALAAP R VAG KEG P 
GLAELLANVAS S I LSH IKASLAAVHLFTAKEVS FSNKPYFRGEF 
CS QG VREG L I VG FVHSMCQTAQSFCDS PGEKGGATP PALLLLLS 
RLCLDYETATISYILTLTDEQFLVQDQFPVTPVSTLCAEARETA 
RRLLTHYVKVQGLVISQMLRKSVETRDWLSTLEPRNVRAVMKRV 
VEDTTAIDVQVLPR]jAGVALTQAGGTVPSRGAGAAEDHWQSLPG 
GGDMCI WASHGAS S VARASVREPQGNKSPRMNTKRAGECLCPRS 
CSFS AQDYDI FAP I LP VEKQRLRVTQE VRAGL.VL VLKI RPQTNS 
CILPLPHSTGS INSDHVPTK 


6148 


2te6 


353 


VPAVGGTFADGAMGEAEKFHYIYSCDLDINVQIiXIGSLEGKREQ 
KSYKAVLEDPMLKFSGLYQETCSDLYVTCQVFAEGKPIiALPVRT 
SYKAFSTRWNWNEWLKLPVKYPDLPRNAQVALTIWDVYGPGKAV 
PVGGTTVS LFGKYGMFRQGMHDLKVWPN CRSQMDQKPTKTPGRT 
SSTLSEDQMSRLAKLTKAHRQGHMVKVDWLDRLTFREIEMINES 
VKRS SNFM YLMGG FRCVKCDD KE YGI VY YE KDGDES S P I LTS FE 
LVKVPDPQMSLENLVESKHHNLPRSLRSGPSDHDLKPYPSPRDQ 
LKN1VSYPPSKPPTYEEQDLVMEFRYYLTNQDKALT KILTS VI W 
DL PQGAKQALALLG KWK PMD VEDS LELLS SH YTNPT VRR YAVAR 
LRQADDEDLLMYLLQLVQALKYENFDDIKNGLEPTKKDSQSSVS 
ENVSNSGINSAEIDSSQIIT/SAPFPSVSSPPP\ASKTKEVPDG 
ENLEQDLC TFLI SRAS KNSTLAN YLYW YVI VECEDQDTQQRDP K 
THEMYLNVMRRFSOALLKGDKSVRVMRSLXJU^QO/TFVDRLVHLM 
KAVQRESGNRKXKNERLQALLGDKrEKMNLSDVELIPLPLSPQVK 
I RGI I PETATLFKSALMPAQLFFKTEDGGKYPVI FKHGDDLRQD 
QLILOIISLMDKLLRJCENLDLKLTPYKVLATSTKHGFMQFIQSV 








P VAEVLDTEGS I QNF FRKYAPS ENG PNG I SAEVMDTYVKSCAGY 
CVITYI LGVGDRHLDNLLLTKTGKLFH I DFG Y I LGRDPKPL P P P 
MKLNKEMVEGMGGTQSEQYQEFRKQCYTAFIJILRRYSNLILNLF 
SLMVDANIPDIALEPDKTVKKVQDKFRLDLSDEEAVHYMQSLID 


6149 


1


1413 


R VD PR VRENGTANP I KNGKTS PAS KDQRTGKKTS VQGQ VQ KGND 
ESESDFESDPPSPKSSEEEEQDDEEVliQGEQGDFNDDDTEPENIi 
GHR P LLMD S ED EEEEE KHS SD S D YEQAKAKYS DMSS VYRDRS GS 
GPTQDLNTIIjLTSAQLSSDVAVETPKQEE'DVFGAVPFFAVRAQQ 
PQQEKNEKNLPQHRFPAAGLBQEEFDVFTKAPFSKKVNVQECHA 
VGPEAHTIPGYPKSVDVPGSTPFQPFLTSTSKSESNEDLPGLVP 
FDEITGSQQQKVKQRSLQKLSSRQRRTKQDMSK3NGKRHHGTPT 
S TKKTLKP TYRTPERARRHKKVGRRDSQS SNE FLTI SDS KEN I S 
VALTDGKDRGNVLQPEESLIiDPFGAKPFHSPD\LSWHPP\HQGL 
S \D IRADHNT \VLPGR\ PRQNSLHGSFHSADVLKMDDFGAVP /F 
LTELWQS ITPHQSQQSQPV\ELDPFGAAPF PS KQ 


6150 


372 


37 


MSNI KKY 1 1 D YDWKAS I £ t E IDHDVMTBEKLHQ INNF WSDSE YR 
LNKHGS VLNAVL IMLAQHALLI AI S SDLNAYGWCE FD WNDGNG 
QEGWPPMDGSEGIRITDIDTSGIF 


6151 


1555 


521 


DSNQQSVSGTAASTLLHSFKATIYYQGTGHVQQFYGVTSFYSQT
TPP I VQSYAQPSLQYIQGQQI FTAHPQGVWQPAAAVTl'I VAPG 
QPQPI^PSEIWVTNNXiIjDLPPPSPPKPKTIVLPPNWKTARDPEG 

KIYYYHVITRQTQWDPPTWESPGDDASLEHEAEMDLGTPTYDEN
pmk\askkpktaeaj3tsseiiakkskevfrkemsqfivqclnpyr 
KPDCKVG\RITTTEDFKHLARKLTHGVMNKELKYCKNPE\DLEC

NENVKHKTKEYI KKYMQKFGAVYKPKEDTEFRVTVGPGWEDGWS 
GKTDSRERKSCGPFCSTPVSTVUjMIHHPGBFNPADVN 


6152 


1366 


648 


NRTWSTPSTWMGVALPPLCSTGPWPVTRQITARTTCGAVPAKCP

PWC/DVHEPRCQPPDCHGHGTCVDGHCQCTGHFWRGPGCDELDC 

GPSNCSQHGLCTETGCRCDAGWTGSNCSEECPLGWHGPGCQRPC 

KCEHHCPCDPKTGNCSVSRVKQCLQPPEATLRAGBLSFFTRTAW 

IALTLAIiAFLLLISTAANLSLLLSRAERNRRLHGDYAYHPLQEM 

NGEPLAAEKEQPGGAHNPFKD 


6153 


2 


3368 


GRVGARSPGRAYALLLLLICFNVGSGLHLQVLSTRNENKLLPKH
PHLVRQKRAWITAPVALLEGEDLSKKNPIAKIHSDLAEERGLKI

TYKYTGKGITE P P FG I FVFN KDTGELNVTS I LDRE ETPPFLLTG 
YALDARGNNVEKPLELRIKVLDINDNEPVFTQDVFVGSVEELSA 
AHTLVMKINATDADE PNTLNS KIS YR I VSLEPAY P P VFYLNKDT 
GE I YTTS VTIJJREEHSS YTLTVEARDGNGEVTDKPVKQAQVQ IR 
II^VNDNIPVVENKVLEGMVEENQVNVEVTRIKVFDADEIGSDN 
WLANFTFASGNEGGYFHIETDAQTNEG I VTLIKEVD YEEMKNLD 
FS VI VAKKAAFHKS 1 RSKYKPTP IPI KVKVKNVKEGIHFKSS VI 
S I YVSESMDRSSKGQI IGNFQAFDEDTGLPAHARYVKLEDRDNW 
ISVDSVTSEIKLAKLPDFESRYVQNGTYTVKIVAISEDYPRKTI 
TGTVLINVEDINI)NCPTLIBPVQTICHDAEYVNVTAEDLDGHPN 
SGPFSFSVIDKPPGMAJSKWKIARQESTSVLLQQSEKKLGRSEIQ 
FL I SDNQGFSCPEKQVLTLTVCE VLHGS \ GCREAQHDS YVGLGP 
AAIALM 1 LAFLLLLLVpLLLLMCHCGKGAKGFTP I PGTIEMLHP 
WNNEGAPPEDKWPSFLPVDQGGSLVGRNGVGGMAKBATMKGSS 
SAS I VKGQHEMSEMDGRWEEHRSLLSGRATQFTGATGAI \MTTE 
TTITARATGAS RDVAGAQAAAVAIiNEE FLKNYFTDKAAS YTEED 
ENHTAKOCLLVYSQE ETE SLNAS IGCCS FZ BGELDDRFLDDLGL 
KFKTLAEVCLGQKID INKEI EQRQKPATETSMNTASHSLCEQTM 
VNSENTYSSGSSFPVPKSLQEANAEKVTQEIVTERSVSSROAQK 
VATPLPDPMASRNVIATETSYVTGSTMPPTTVIIjGPSQPQSLIV 
TERVYAPASTLVDQPYANEGTVVVTERVIQPHGGGSNPLEGTQH | 








LQDVP Y VMVRERES FLAP SS GVQ PTliAMPN I AVGQNVTVTHRVIi 
APASTLQSSYQIPTENSMTARNTTVSGAGVPGPIjPDFGLESSGH 
SNST I TTS STRVTKHS TVQHS YS 


6154 


3660 


2146 


KKKTKMKNTLQXTVNFGAWPKPTI SDKSHLLQMVS KLDLTDAKN 
SDTAHX KS 1 BI TS I LNG LQASES S AEDSEQEDERGAQDMDNNGK 
EESKIDHLTNNRNDLISKEEQNSSSLLEENKVHADLVISKPVSK 
SPERLRKDIEVLSEDTDYEEDEVTKKRKDVKKDTTDKSSKPQIK 
RGKRRYCNTEECLKTGSPGKKEBKAKNKESLCMENSSNSSSDED 
EEETXAKMTPTKKYNGLEEKRXSLRTTGFYSGti'^ k'Vivwirp T VT.T. 

NNSDERLQNSRAKDRKDVWSSIQGQWPKKTLKELFSDSDTEAAA 
SPPHPAPEEGVAEESLQTVABEESCSPSVELEKPPPVNVDSKPI 
EEKTVEVNDRKAEFPSSGSNFSA* I PLPYLHLNRLHQSL * QKGS 
RQQSSVTVSBPLAPNQEEVRSIKSETDSTIEVDSVAGEIiQDLQS 
ERE*LASRF*CQCELEQ**SARTRTS*KSLYRSEKSERCSGRRK 
F I KKAEKKP * SNSGKQQKEG K 


6155 


669 


121 


HLLPELRGKSWITMKYVFYLGVLAGTFFFADSSVQKEDPApyLV 
YLKSHFNPCVGVLIKPSWVLAPAHCYLPNLKVMLGNFKSRVRDG 
TBQT INPIQI VRYWNYSHSAPQDDLMLIKLAKPAMLNPKVOALN 
P\ PTTNVRPGTVCLLSGLDWSQENSGRHPDLRQNLEAPVMS DRE 
CQKTEQGKSHRNS LCVKFVKVFSR I FGEVAVATVI CKJDKIiQGIE 
VGHFMGGDVG I YTNVYKYVS W I ENTAKDK 


6156 


5725 


3984 


GTSTVTMATKKHFSI ILNLIiG^LKKDNQDTRKLIjMTWAIiEVAV 
v [*iru\.o £» i iJWUc LbForn J\r L. Jj IiAL) TbVE DVN I CLQACS S LH 
ALSSSLPDDLLQRCVDVCRVQLVHRGTCIRQAFGKLLKS I PLGV i 
FLSNNNHTE IQE I S LALRS HMSKAPSNTFHPQDFS D/VI S F I LY ! 

GNSHRTGKDKWLERLFYSCQRLDKRDQSTIPRNLLKTDAVLWQW

A I WE AAQ FTVLS KLRTPLGRAQDTFQTXEGX I RSLAGHTLNPDQ : 

ALTS PPKVIRTFLYTNRQTCQDWLTRIRLS IMRVGLLAGQPAVT 
VRHGFDLLTEMKTTSLSQGNELEVSIMM\A/EALCELHCPEAIQG 
IAVWSSSIVGKHLLWINSVAQQAEGRFEKASVEYQEHLCAMTGV 
DCC IS S FDKS VLTLASAGCKSASLKHCIiNGESRKS VLSKPTDSS 
PEVINYLGNKACEC YISTADWAAVQEWQNAIHDLKKSTS STS LN 
LKADFNYIKSLSS FESGKFVECTEQLELLPGENINLLAGGSKEK 
IDMKKLLRNM 


6157 


946 


329 


MANRGPSYGLSREVQEKIEQKYDADLENKLVDWIILQCAEDIEH 
PPPGRAHFQKWLMDGTVLCK IiINSLYP PGQE P 1 PKISESKMAFK 
QMEQISQFLKAAETYGVRTTDIFQTVDLWEGKDMAAVQRTLMAL 
GSVAVTKDDGCYRGEPSWFHRKAQQNRRGFSEEQLRQGQNVIGL 
QMGSNKGASQAGMTGYGMPRQIM*DAASCP 


6158 


441 


1482 


LGSLI VLSLHCKVI FSSQS LERAMKEKAVDLVP ILAQNPGLAQN 
P ILEGKDHNQHTG VDP 1 1 DHVQDRKTD /S RSKS PHKKRS KSRER 
RKSRSRSHSRDKRKDTREKI KEKERVKEKDREKEREREKEREKE 
KERGKNKDRDKEREKDRBKDKEKDREREREKEHEKDRDKEKEKE 
QDKEKEREKDRSKEIDEKRKKDKKSRTPPRSYNASRRSRSSSRE 
RRRRRSRS S SRS PRTSKTI KRKSSHS PSPRSRNKKDKKRE KERD 
HISERRERERSTSMRKSSNDRDGKEKLEKNSTSLKEKEHNKEPD 
SSVSKEVDDKDAPRTEENKIQHNGNCQLNEENLSTKTEAV 


6159


53


84 


AVIAPLHISLGDRARPYLKNTEKSSTTCSRRRNQSFPPVMSLTH 
RLHLCKYWGCAVSNVCRFWEGRPLPLMIWPYTLPVSrjPVGSCV 
1 1 TGTP I LTFVKDPQLEVNFYTGMDEDS D I AFQ FRLHFGHPAIM 
NSCVFGI WR YEE KC Y YLP FEDGKPFELC I YVRHKE YKVMVNGQR 
I YNFAHRF P PAS VKMLQVFRD I SLTRVLISD * GRC VRI TAVQE F 
DVS VS CDCTTAYQPG 


6160 


1626 


1790 


AGAKFFP* F*KVADAQPTBSEKEI YNQVNWLKDAEGILEDLQS " 
YRGAGHEIREAIQHPADEKLQEKAWGAWPLVGKLKKFYEFSQR 








LEAALRGLIiGALTSTPYSPTQHLEREQALAKQFAJSILHFTIiRFD 
ELKMTNPAIQNDFS Y YRRTLS RMR I NNVPAEGENB VNNE LANRM 
SLF YAEAT PML KTLS DATTKFVS ENXNT,P 7 RNfTTnr'T.Q TMa q \rn 
RVMLETPE YRSRFTNEETVS FCLRVMVGVI ILYDKVHPVGAFAK 
TSKIDMKGCIKVLKDQPPNSVEGLLCUuUlYTTKHUn)ETTSKQI 
KSMLQ*QLLTLVNKG 


6161 


455 


1569 


PVSGSESSLRRAWASIIiRLMIiGPRVAVSILCEDGISH*LLEKH* 

TGQLHLLMVNETRPRLQKVASWQAHQFEAWIAAFNYWHPEIVYS 
GGDDGLLRGWDTRVPGKFliFTSKRHTMGVCS I QSS PHREHILAT 
GS YDEHI LLWDTRNM KQPLADTPVQGGVPJRI KWHPFHHHLLLAA 
CMHSG FKI liNCQKAMEERQEAT VLTSHTLPDSIiVYGAJDWS WLLF 
RSLQRAPSMSFPSNLGTKTADLKGASELPTPCHECRBDNDGEGH 
ARPQSGMKP LTEGMRKNGTWI*QATAATTRDCGVNPEEADS AFSL 
LATCS FYDH AT .Hr .WF W Rrt W


6162


1 


586 


RTIHATGRAGAS PMHRL I WRXiAEANKQHVRCQKCliEFGHWTYE 
CTGKRKYLHRPSRTAELKKALKEKENRLLLOQS IGETNVERKAK 
KKRSKSVTSSSSSSSDSSASDSSSESEETSTSSSSEDSDTDESS 
SSSSSSASSTTSSSSSDSDSDSSSSSKQ*HQHR*QL*R*TTKEE 
EKEIELLHSYWTDGLKTLM 


6163 


1081 


785 


RIRSTTEGCAVRI/HPTQNTGKARIMILIjSVSIjGRHWAFTYKFFIi 

TPVV7VFPFPFFHRKE*VMQKNPMKSREDEWMEKLNNI^^ 

MNRLI MNYLVTEGFKEAAEKFRMESGIEPSVDIiETLDERI KIRE 

MILKGQ XQEAIAIi INS LHP EXiIiDTNR YLYFHLQQQHLIEL I RQR 

ETEAALEFAQTQIjAJEQGEESRECLTEMERTLALLAFDSPEESPF 

GDLLHTMQRQKWSEVNQAVLDYENRESTPKIiAXLLKLLLWAQN 

ELDQKKVKYPKMTDLSKGVIEEPK 


6164 


90 



406 


PCQS PGRSRMRQDKLTGSLRRGGRCLKRQGGGVGTILSNVLKKR 
SCISRTAPRLLCTLEPGVDTKLKFTLEPSLGQNGFQQWYDALKA 
VARLS TG I PXE WRRKVWLTLADHYLHS IAIDWDKTMRFTFNERS 
NPDDDS M GI Q I VKDLHRTGCS S YCGQEAEQDRWLKRVLLAYAH 
WNKTVGYCQGFNI LAAIi ILEVMEGNEGDALKIM I YLIDKVL P ES 
YFVNNLRALS VDMAV FRDLLRMKLPELSQHLDTLQRTANKES GO 
G YEP PLTNVFTMOWFLTTiFAT CL»PNfiTVT»Tf T Wn QvPirprcT? ttt 

RVSIaAIWAKTjGEQIECGETADEFYSTMGRLTOEMLENDU^QSHB 
LMQTVYSMAPFPFPQIAELREK YTYNI TPPPATVKPTS VSGRHS 
KARDSDEENDPDDEDAVVNAVGCLGPFSGFLAPEIiQKYQKQIKE 

pneeqsi^siwiaelspgainscr^eyhaafnsmi^ermttdin 
ALKRQYSRIKKKQQQQVHQVYIRADKGPVTSILPSQVNSSPVIN
hlllg kkmkm1t9raaknavih i pghtggkis pvp yedlxtklns 
PWRTHIRVHKKNMPRTKSHPGCGDTVGLIDEQNEASKTNGLGAA

E AFPS GCTATAGREGS S PEG S TRRTI EGQS PEPVFGDADVDVS A 
VQAKLGALELNQRDAAAETELRVHPP CQRHCPEPPSAPEENKAT 
S KAPQGSNS KTP 1 FS P F PS VKPLRKSATARNLGLYGP TERTPT V 
HFPQMSRSFS KPGGGNSGP* KMVFSSGTMLSRQLPGYPQEYQRN 
GGERFG 


6165 


90 


405 


PCQSPGRSRMRQDKLTGSLRRGGRCLKRQGGGVGTILSNVLKKR

SCISRTAPRLLCTLEPGVOTKLKFTLEPSLGQNGFQQWYDALKA 

VARLSTG I PKEWRRKVWLTIiADHYT^S IAIDWDKTMRFTFNERS 

N PDDDS MG I Q I VKDLHRTGCS S YCGQE AEQDRWLKRVLIAYAR 

WNKWGYCQGFNILAALILEVMEGNEGDALKIMIYLIDKVL 

YFVNNLRALSVDMAVFRDLLRMKLPELSQHLDTLORTAWKE 

GYEPPLTNVFTMQWFIiTLFATCLPNQTVLKIWDSVFFEGSEIIL 

RVSLAJWAKLGEQIECCETAJDEFYSTMGRLTQEMLENDLLQSHE 

LMQTVYSMAPFPFPQLAELREKYl^ITPFPATVKPTSVSGRHS 

KARDSDEBNDPDDEDAWNAVGCLGPFSGFLAPELQKYQKQIKB 








PNEEQS LRSNN I AELS PGAINSCRS E YHAAFNS MMMKRMTTD I N 
ALKRQ YSRI KKKQQQQVHQVY I RAD KGPVTS I LPSQVNSS P VIN 
HLLLGKKMKMTNRAAKNAVIHI PGHTGGKIS PVPYEDLKTKLNS 
PWRTH I RVHKKNMPRTKSHPGCGDTVGLIDEQNEASKTNGLGAA 
EAF PS GCTATAGREGSS P EGSTRRTI EGQSPEPVFGDADVDVS A 
VQAKLGALELNQRDAAAETELRVHPPCQRHCPE P PSAPEENKAT 
SKAPQGSNSKTPI PS PPPS VKPLRKSATARNLGLYGPTERTPTV 
HPPQMSRSFSKPGGGNSGP * KMVFSSGTMLSRQLPGYPQEYQRN 
GGERFG 


6166 


2 


1206 


HKLVjRTVAMAGAEWKSLEECIJEKHLPLPDLQEVKRVLYGKELRK- 
LDLPREAPEAASREDFELQGYAFEAAEEQLRRPRIVHVGLVQNR 
I PLPANAP VAEQVSALHRRI KAI VE VAAMCGVN I ICPQEAWTMP 
FAFCTREKLPWTE FAESAEDG PTTRFCQKLAKNHDMVWS PILE 
RDS EHGD VLWNTAWISNS GAVLG KTRKNH I PRVGDFNESTYYM 
EGNLGHP VFQTQFGRIAVNICYGRHHPLNWLMYS INGAE 1 1 FN P 
SATIGALSESLWPIEARNAAIANHCFTCAINRVGTEHFPNEFTS 
GDGKKAHQDFGY FYGSS YVAAPDSS RTPGLSRSRDGLLVAKLDL 
KTLCQQVNDVWNFKMTGRYEMYARELAEAVKSNYSPTIVKE* PAS 
VPALG


6167


1220 


1844 


YGIVTGPSLCAGDKQPKKQEKNPVLVSPEFVDEALCACEEYLSN 
LAHMD I DKDLEAPL YLT PEGWSLFLQRYYQ WHEGAELRHLJDTO 
VQRCED I LQQLQAWPQ I DMEGDRN I W I VKPGAKS RGRG I MCMD 
HLEEMLKL VNGNP WM KDG KWWQ KYIERPLLI FGT KFDLRQWF 
LVTDWNPLTVWFYRDS Y I R FSTQPFS LKNLDK* APLYLTPEG WS 
LFLQR YYQ WHEGAE LRHLDTQ VQRCED I LQQLQAWPQ I DMEG 
DRNIW I VKPGAKS RGRG I MCMDHLEEMLKLVNGNPVVWKDGKWV 
VQKYI BRPLL I FGTKPDLRQWFLVTDWNPLTVWFYRDSY I RFST 
QPPS LKNLDK 


6168 


64 


1392 


VWPVPSVSAMPPKKQAQAGGSKKAEQKKKEKIIEDKTFGLKNKK 
GAKQQKFI KAVTHQ VKFGQQNPRQVAQSEAEKKLKKDDKKKE LQ 
ELNELFKP WAAQKI S KGADP KS WCAFFKQGQCTKGDKCKFSH 
DLTLERKCEKRSVYIDARDEELEKDTMDNWDEKKLEEVVNKKHG 
EAEKKKPKTQI VCKHFLEAI ENNKYGWPWVCPGGGD ICM YRHAL 
PPGFVLKKKKKKKKKEDEISL*DLIERERSALGPNVTKITLESP 
IAWKKRKRQEKIDKLEQDMERRKADFKAGKALVISGREVFEFRP 
ELVNDDDEBADDTRYTQGTGGDEVDDS VSVNDIDLS L YI PRDVD 
ETGI TVAS LER FS TYTSDXDENKLS EA5 GGRAENGERS DLE SDN 
EREGTBNGAIDAVPVDENLFTGEDLDELEEBLNTLDLEE 


6169 


112 


662 


APAAAMAERPEDLNLPNAV I TRI I KEALPDGVN I S KfiARSAI S R 
AAS VFVL YATSCANNFAMKG KRKTLNAS DVLSAMEEME FQRFVT 
PLKEALEAYRREQKGKXEASEQKKKDKDKKTDSEEQDKSRDEDN 
DEDEBRLEEEEQNEEEEVDN* KGRETVAPWKVPLEMRRATCFCE 
AFPCWAE 


6170 


62 


667 


STKVMLPNTGRLAGCTVFI TGASRGIGKAIALKAAKDGANIVIA 

VEKAI KKFGG ID I LVNNASAI SLTtHliDTPTKRIiDLMMNVNTRG 
TYUVSKACIPYLKKSKVAHIPNISPPLNLNPVWFKQHCGRW*VV 
G * GDGLCL I CFELNLCMSDV I T I CT 


6171 


382 


941 


HFMQSDVELDCDIEPCGHTKFPPTLPLSTTVIVCSCHPVATAST 
MAEAFS KTTSEEDQS I QE P KEANSMTAQKQKK*GLRG S RRRHAN 
SGGDI FGDS FAAY F P R VL KQ VHQALS LS QEAVSVMDS MVRD I LD 
R I ATEAGHLAHYSKC VTITSRD IRMAVCLLLPGKMG3CLAE5QGT 
NATLRYTKSK 


6172 


651 


54 


GLCRAGGAHRFSRTHVEAALKMLRREARLRREYLYRKAREEAQR 
SAQERKERLRRALEENRLI PTELRREALALQGSLEFDDAGGEGV 
TSHVDDE YRWAGVED PKVM ITTSRDPS SRLKMFAKELKLVPPGA 








QRMNRGRHE VGALVRACKANG VTDLLVVHEHRGTP VGL I VS HI*P 
FGPTAYFTLCNVVMRHDIPDLGTMSEAKPHLITHGFSSRLGKRV 
SDILRYLFPVPKDDSHRVITFANQDDYISFRHHVYKKTDHRNVE 
LTEVGPRFELKLYMI RLGTLEQEATADVEWRWHPYTNTARKRVF 
LSTE*AAPRPLGQLL 


6173 


3 


2B8 


SVDHREVQVLSQSMPLTPHQAVLRGERPYMCVECGKCFGRSSHL 
LQHQRIHTGEKPYVCSVCGKAFSQSSVLSKHRTIHTGEKPYECN 
ECGKAFRVSSDLAQHHKIHTGEKPHECLECRKAFTQLSHLIQHQ 
R I HTGER P Y VCPLOG KAFNHST VLRSHQRVHTGEKPHRCNECGK 
TFSVKRTLLQHQRIHTGEKPYTCSECGKAFSDRSVLIQHHNVHT 
GEKPYECSECGKTFSHRSTLMNHERIHTEEKPYACYECGKAFVQ 
HSHIiIQHQKVHRKL* PTCVLSVGSALAGVPTSFS ISVSTLERSP 
MCAVYVGR PS ARAQS LVNTGQ FTQ VRSPMSVMS VEKPLE 


6174 


1060 


959 


PRPPGKRWMVAGIiGNPGLPGTRHSVGMAVLGQLARRLGVABSWT 
RDRHCAADLALAPLGDAQLVLLR PRRLMNANGRS VARAAELFGL 
TAEEVYLVHDELDKPLGRLALKLGGSARGHNGVRSCISCLNSNA 
MPRLR VGIGRPAHPEAVQAHVLG CFSPAEQE LLPLLLDRATDL I 
LDHIRERSQGPSLGP*H*WFSKKA 


6175


2204 


334 


RY FRADPRSRSGQPRAEGLGAFAEGPLRAMAAP VKGNRKQS TEG 
DALDPPASPKPAGKQNGIQNPISLEDSPEAGGEREEEQEREEEQ 
AFLVSLYKFMKERHTP I ER VPHLG FKQ INLWKI YKAVEKLGAYE 
LVTGRRLWXNVYNELGGS PGSTSGATCTRRH Y * RLVLP YVRHLK 
GEDD KPLPTS KPRKQYKMAKENRGDDGATERPKKAKEERRMDQM 
MPGKTKADAADPAPLPSQEPPRNSTEC2QGLASGSSVSFVGASGC 
PEAYKRLLSS FYCKGTHG IMSPLAKKKLLAQVS KVEALQ CQEEG 
CRHGAEPQAS PAVHLPES PQSPKGLTENSRHRLTPQEGLQAPGG 
SLREEAQAGPCPAAP I FKGCFYTHPTEVLKPVSQHPRDFFSRLK 
DGVLLG PPGKEGLS VKE PQLVWGGDANRPSAFHKGGSRKG I IiYP 
KPKACWVSPMAKVPAESPTLPPTFPSSPGLGSKRSLEEEGAAHS 
GKRLRAVS PFL KEADAKKCGAKPAGSGLVS CLLG PALGPVP PEA 
YRGTMLHCPLN FTGTPG PLKGQAALPFSPLVT PAFPAHFLA.TAG 
PSPMAAGLMHFPPTS FDSALRHRLCPASSAWHAPPVTTYAAPHF 
FHLNTKL 


6176 


1040 


402 


PLSALRAMAEVHVIGQI IGASGFSESSLFCKWGIHTGAAWKLLS 
GVREGQTQVDTPQIGDMAYWSHPIDLHFATKGLQGWPRLHFQVW 
SQDSFGRCQLAGYGFCHVPSS PGTHQLACPTWRPLGSWREQLAR 
AFVGGGPQLLHGDT I YSGADR YRLHTAAGGTVHLE IGLLLRNFD 
RYGVEC*GTLPPTSPPSTPRTPSDGGGWHSGQEHRL 


6177 


1400 


992 


VP I ESLVG KVHNFPLI AFYCCE KGKRQPHKS LHDRC FGEALD PN 
CSHCYLDQI KRSDFLGFSGYS PHFVAISTNSEHKMQPSSMQQAL 
PS Q* P YWTDPR PALVPCCS HRPDVHRSRPGPGL PGTSGCS DRP P 
VCPI 


6178 


1027 


254 


STQRGG I KG VARAAS LVGRRRAGTGMALLLCIiVCliTAALAHGCL 
HCHSNFSKKFSF YRHHVNFKS WWVGDI PVSGALLTDWSDDTMKE 
LH LA I PARI TREKLDQVATAVYQMMDQL YQGKM YFPGYFPNEIiR 
NI FREQVHLIQNAI I ESRIDCQHRCGI FQYETISCNNCTDSHVA 
CFGYNCESSAQWKSAVQGLLNYINNWHKQDTSMRPRSSAFSWPG 
THRAAPAFLVLPALRCLEPPHLANLSLEDAA* CLKQH 


6179 


806 


276 


RGETREMAGNLLSGAGRRIiWDWVPLACRSFSLGVPRLIGIRLTL 
PPPKVVDRWNEKRAMFGVYDNIGILGNFEKHPKELIRGPIWLRG 
WKGNEU2RCIRKRKMVGSRMFADDLHNLNKRIRYLYKHFNRHGK 
FR*KRKLRTSEKAHLSPWRRETVLFPVRKRLCIFSVIKWGFFGI 


6180 


156 


1833 


DHHI LKAA5 TTHVCARGNI FAI PNTRCLE C*ATATPS SLECQN * 
SHLSLCPLPATTSGLTPNSMI PEKERQNIAERLLRVMCADLGAL 
S WS GKEFLKLAO^TLVDSGAR YGAFS VTE ILGNFNTIiALKHLPR 
MYNQVKVKVTCALGSNACLGIGVTCHSQSVGPDSCYILTAYQAE 








GNHIKSyVLGVKGADIRDSGDIiVHHWVQWVLSEFVMSEIRTVYV 
TDCRVSTSAFSKAGMCLRCSACALNSWQSVLSKRTLQARSMHE 
VIELLNVCEDLAGSTGLAKBTFGSLEETSPPPCWNSVTDSLLLV 
HERYEQICEFYSRAKKMNLIQSLNKHLLSNLAAILTPVKQAVIE 
LSNESQPTLQLVLPTYVRLEKLFTAKANDAGTVSKLCHLFLEAL 
KE^KVHPAHKVAMILDPQQKLRPVPPYQHEEIIGKVCELINEV' 
KBSWAEBADFEPAAKKPRSAAVENPAAQEDDRLGKNEVYDYLQE 
PLFQATPDLF^YWSCVTQKHTKLAKLAFWLLAVPAVGARSGCVN 
MCEQALLIKRRRLLSPEDMNKLMFLKSNML 


6181 


169 


1032 


TRTLLSPVLLPGPRWKPWRRRPMGPtiAiPAWLQPRYRKNAYLFI 
YYI*IQFCGHSWIFOTMTVRFFSFGKDSMVDTFYAIGLVMRLCQS 
VSLLELLH I YVGIESNHLLPRFLQLTERI I ILFWITSQEEVQE 
KYWCVLFVFWNLliDMVRYTYSMLSVIGISYAVLTWLSQTLWMP 
I YP LCVLAEAFAI YQS LP YF ES FGT YS TKLPFDLS I Y FP YVLK I 
YLMMLFI GM YFTYSHL YSERRDI LG I F P I XKKKM*STAFQCDTR 
KDRLWIQCSK*NTGSILVEKFLVF 


6182 


1769 


1224 


AS*IDYQLNTLUCEFQL^E^TKLRYLT<iSLIEDMAAAYFPDCI " 
VRPFGSSVNTFGKLGCDLDMFLDLDETRNLSAHKISGNFLMEFQ 
VKNVPSERlATQKlLSVLGECn^DIIFGPGCVGVQKILNARCPLVR 
FSHQASGFQCDLTTNNRIALTSSELLYIYGA1jDSRVRAI.VFSVR 
CWARAH3LTSS I PGAWI TNFSLTMM7I FFLQRRSPP ILPTLDSL 
KTLADAEDKCVIEGNNCTFVRDLSRIKPSQNTETLELLLKEFFE 
YFGNFAFDKNS INI RQGREQNKPDSS PLY I QNP FETS LN I S KNV 
SQSQLQKFVDLARESAW ILQQEDTDRPS ISSNRPWGLVSLLLPS 
APNRKS FTKKKSNKFAI ETVKNLLES LKGNRTENFTKTSG KRT I 
STQT 


6183 


1118 


452 


HLDRYIKSPGSGSSTPAPPSHLLLYLLHPQSTRTMGCCGCSRGC 
GSGCGGCGSSCGGCGSGCGGCGSGRGGCGSGCGGCSSSCGGCGS 
RCYVPVCX:CKPVCSWVPACSCTSCX3SCGGSKGGCGSCGGSKGGC 
GSCGCSQSSCCKPCCCSSGCGSSCCQSSCCKPCCCQSSCCVPVC 
CQSSCCKPCCCQSNCCVPVCCQCKI*GSGPRPSGFSCLVXAFLM 
VP 


6184 


1 


21*1 


IVTVREEDGAPAVAPPGWVSRANKRSGAGPGGSGGGGARGAEE 
EPPPPLQAVLVADSFDRRFFPISKDQPRVLLPLANVALIDYTLE 
FLTATGVQETFVFCCWKAAQIKEHLLKSKWCRPTSLNWRI ITS 
ELYRSI/5DVLRDVT3AKALVRSDFLLVYGDVISNINITRALEEHR 
LRR KL * KNVS VMTM I FKE S S PSHPTRCHEDNWVAVDS TTNRVL 
HFQKTQGLRRFAFPLSLFQGSSDGVEVRYDLLDCHISICSPQVA 
QLFTDN FDYQTRDD FVRGLL VNEE I LGNQIHMHVTAKE YGARVS 
NLHMYSAVCADVIRRWVYPLTPEANFTDSTTQSCTHSRHNIYRG 
PEVSLGHGSILEENVLLGSGTVIGSNCFITNSVIGPGCHIEPGD 
NWLDQT YLWQG VRVAAGAQ I HQSLLCDNAEVKERVTLKPRS VL 
TSQVWGPNITLPEGSVISLHPPDABEDEDDGEFSDDSGADQEK 
DKVKMKGYNPAEVGAAGKGYLWKAAGMNMEEEEELQQNLWGLKI 
WJiM^ao&d&oJ&u^P^oiSCiJriJ&Kljuo r\Ui t lUUxj\v t UwisvXiGTLQR 
GXEENISCDNLVLBINSLKYAYNISLKEVMQVLSHWLEFPLQQ 
MDSPLDSSRYCALLLPLLKAWS PVFRNYI KRAADHLEALAAI ED 
FFLEHEALG ISMAKVLMAFYQLE I LAEETI L5WFSQRDTTD KGQ 
OLRKNQQLQRFIQWLKEAEEESSEDD 


6185


791 


44 


P CTS C VLWATLHLPASTRKAPQAECGM I S I TEWQKIGVGITG FG 
IFFILFGTLXiYFDSVLLAFGNLLFLTGLSLIIGLRKTFWFFFQR 
HKLKGTSFLLGGWIVLLRWPLLGMFLETYGFFSLFKGFFPVAF 
GFLGNVCNI PFLGALFRRLQGTSSMV* KTEMSSLNLDHWLKGAK 
REEWEPPPQSPALTHSPTYPGPPQVQKERNGAEQLTSNPQVDSR 
GCQEAEMQTP RRLGWGWYHTLTLYLWEEK 


6186 


5*9 


238 


VYQIDSSNTNTHC^ERNRKL^WKLCHAQSRLDVNGL^KIviA 








KE RKVKNKVKNKADTEE VFNNS PTNQE KMPTSAI I»PD?SGS VI S 
NIRNQMETIiHSQPHQEENLCFENS FSLINLLPINAVEPTSSQQ I 
PNRETSEANKERRKMTSKSSESNIYSPLTSFITADSELHDIIKD 
IiEDCLMVGLHTCGDIiAPimiRIPTSNSEIKGVCSVGCCYHLIiSE 
EFENQHKERTQEKWGPPMCHYLKEERWCCGRNARMSACLALERV 
AAGQGL PTES LFYRAVLQD I IKDCYGITKCDRHVGKI YSKCS S P 
LDYVRRSLKKIjGLDESKLPEKI IMNYYEKYKPRMNELKAFNMLK 
WLAPC IETLI LLD RLCYLKEQED I AWSALVKLFD PVKS PRCYA 
VIALKKQQ*FPLKQIIRC1SL*DSAGCMEVSVGDGGPALRDAP 
PSGSRVGSRYD 


6187 


1701 


771 


DAWGPETRLARILNPDSFIE PRPGRLPELEATRPHMEPKASCPA " " 
AAPLMERKPHVLVGVTGSVAALKLPLLVSKLLDIPGIiEVAWTT 
ERAKHFYS PQDIPVTLYSDADEWEMWKSRSDPVLHIDLRRWADL 
LLVAPIJJANTLGKVASGICDNLLTCVMRAV7DRSKPLLFCPAMNT 
AMWEHP ITAQQVDQLKAFGYVEIPCVAKKIiVCGDEGLGAMAEVG 
T1VDICVKEVLFQHSGFQQS*PGISVMGVPLYSEWVQAKSVKMDV 
GKIGGYPHljLNGGPALS LPRGQACSRLNWTEGPGLS FFQPGEAA 
A 


6188 


238 


1534 


KGFVNAGPLMAELQVSPQWKAPEMSQICLSCGHPSA*GPRWASW 
N IGWICI RCAGIHRNLGVHISRVKSVNIJXJWTQEQIQCMQEMG 
NGKANRLYEAYLPETFRRPQIDPAVEGFIRDKYEKKKYMDRSLD 
INAFRKEKDDKWKRGSEPVPEKKLEPWPBKVKMPQKKEDPQLP 
RKSSPKSTAPVKDLLGLDAPVACSIANSKTSNTLEKDLDLLASV 
PS PS SSGSRKWGSMPTAGSAGSVP2NLNI*FPEPGS KS E E I GKK 
QLSKDSILSLYGSQTPQMPTQAMFMAPAQMAYPTAYPSFPGVTP 
PNS I MGSMMPPP VGMVAQPGASGMVAPMAM PAG YMGGMQASMMG 
VPNGMMTTQQAGYMAGMAAMPQTVYGVQPAQQLQWNLTQMTQQM 
AGMN PYGANGMMNYGQSMSGGNEQAANQTLS PQMWK 


6189 


1297


793 


LGEPIiGDLCEL I PGDVQQLQMGE VH PGTGAQGSAAQSVAG3 VQI> 
TQLSHARQRPSCQGSQIilALDLQHMDISRQPRWQHVQPVARQVQ 
RAQQAQLABGVAVHLWAGDAVVAEVELl^EVGGGKVFAANACDL 
WQDHEGAJHAARQATGHALQRVIVQVRRVQPLEAL*RVPSGLPR 
RVRAFMILHNQITGIGREDFATTYFLEBLNLSYNRITSPQVHRD 
AFRKLRLLRSIJDLSGNRLHMLPPGLPRITVHVLKVKRNEIJUUJ^ 
GALAGMAQLREL YLTSNRljRSRAIiGPRAWVDLAHLQLLD I AGNQ 
LTEI PEGLPES LE YLYLQNNKIS AVPANAFDSTPNIiKG 1 FLR FN 
KLAVG S WDSAFRRLKHLQVLDIEGNLEFGDI 3 KORGRLGKEKE 
EE EEDE VEEEETR 


6190 


6* 


1309 


I LVGNVSFLLS FAE YVCNCS WGS IjNVNRCNQTTGQCE CRPG YQ 
GLHCE TCKEGFYLN YTSGLCQPCDCS PHGALS I PCNS SGKCQ CK 
VGVIGS I CDRCQDGYYGFSKNGCLPCQCNNRSASCDALTGACLN 
CQENS KGNHCEECKEGFYQS PDATKE CLRCPCS AVTSTGS CS I K 
SSELE PECDQCKDG YIG PNCNKCENG YYNPDS I CRKOQCHGHVY 
PVKTPKICKPESGECINCIiHNTTGPWCENCL*GYVHDIjEGNCIK 
KVIL PTPEGST ILVSNASLTTSVPTPVINSTFTPTTLQT I FS VS 
TSENSTSALADVS WTQFNI I ILTVI 1 1 VVVLtiMGFVGAVYM YRE 
YQNRKLNAPPWTI ELKEDNISFSSYHDS X PNAD VSGLLEDDGNE 
VAPNGQLTLTTPIHNYKA 


6191 


1212 


1511 


VNLCHGGLU-iLS THHI/3IXPSMH* LFFLMLS FPHLT PQQ PKCP S 
M I DW I KKI WY I YTME YYATIKRNEIMF FAGTWMEMEAI I LSKLM 
QDYMFSLISGS 


6192 


3 


950 


TRGCGNKMAGKKWLSSLAVYAEDSEPESDGEAGIEAVGSAAEE
KGGLVS DAYGEDDFS RLGGDEDG YEEEEDENSRQS EDDDSETEK 
PBADDPKDNTEAE KRDPQELVAS FSERVRNMS PDE I K I P PEP PG 
RCSNHLQDKIQKLYERKI KEGMDMNYI IQRKKEPRNPSIYEKLI 
QFCAIDELGTNYPKDMFDPHGWSEDSYYEALAKAQKIEMDKLEK 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


AKKKRTKXEPVTGTKKGTTTNATSTTTTTASTAVADAQKRKSKW 

DSAIPVTTIAQPTILTTTATIiPAVVTVTTSASGSKTTVISAVGT 
IVKKAKQ 


6193 


3 


950 


trgcgnkmagkknvlssiaWaBdsspesdgeagieavgsaaee 
kggl vsdaygedd fs rlggdedgye3eedensrqseddds ete k 

PEADDPKDNTEAEKRDPQELVASPSERVRNMSPDEIKIPPEPPG 
RCSNHLQDKI QKLYBRKIKEGMDMNYI IQRKKEFRNPS I YEKLI 
QFCAIDELGTNYPKDMFDPHGWSEDSYYEALAKAQKIEMDKLEK 
AKKERTKI E PVTGTKKGTTTNATSTTTTTASTAVADAQKRKSKW 
DSAI ?VTI I AQ PTILTTTATLPAVVTVTTSASGS KTTVI SAVGT 
IVKKAKQ 


6194 


3 


950 


TRG CGNKMAGKKNVLS SLAVYAE Dfl E PESDGEAG I EAVGS AAEE 
KGGLVSDAYGEDDFSRLGGDEDGYEEEEDENSRQSEDDDSETEK 
PEADDPKDNTEAEKRDPQELVASFSERVRNMSPDEIKIPPEPPG 
RCSNHLQDKIQKLYERKIKEGMDMNYIIQRKKEFRNPSIYEKLI 
QF CAIDELGTNYPKDMFD PHGWSEDS Y YE ALAKAQKI EMDKLEK 
AXKERTK I E FVTGTKKGTTTNATS TTTTTAS TAVADAQKRKS KW 
DSAI PVTTIAQPTILTTTATLPAVVTVTTSASGS KTTVI SAVGT 
IVKKAKQ 


6195 


736 


235 


VANGLQS NMP KF YCD YCDT YLTHDS PSVRKTHCSGRKHKENVKD 
YYQKWMEEQAQSIiIDKTTAAFQQGKI PP TPFS APP PAGAM I PP P 
PSLPGPPRPGMMPAPHMGQPP^JPMMGPPPPGMMPVGPAPGMRP 
PMGGHMPMMPGP PMMRP PARPMMVP TRPGMTRPDR 


6196 


1512 


623 


KTGKRRSAAYVRNIIjDNAEQVISNLEARNLGPRLTPLLQEEDSH
QRIiLMGIiMVSEL KDHFLRHLQGVEKKKIEQMVLDYI SKLLDLIC 
HI VBTNWRKKNLHS WVLHFNSRGSAAE FAVFHIMTRI LEATNS L 
FLPLPPGFHTLHTILGVQCLPLHNLUiCIDSGVItLLTETAVIRL 
MKDLDNTEKNEKLKFSIIVRLPPLIGQKICRLWDHPMSSNIISR 
NHVTRHjQNYKKQPRNSM INKSSFS VE FLPLNY F I E I LTDI ESS 
NQALYPFEGHDNVDAEFVEEAALKHTAMLLGL


6197 


3 


819


ADPEGTE3AVMSRYTRPPWTSLFIRNVADATRPEDLRREFGRYG 
PIVDVYIPLDFYTRRPRGFAYVQFEDVRDAEDALYNLNRKWVCG 
RQIE IQFAQGDRKTPGQMKSKERHPCSPSDHRRSRSPSQRRTRS 
RSSSWGRNRRRSDSLKESRHRRFSYSQSKSRSKSLPRRSTSARQ 
SRTPRRNFGSRGRSRS KSLQKRS KS I GKS QS SS PQKQTSSGTKS 
RSHGRHSDS IARSPCKSPKGYTNFETKVQTAKHSHFRSKSRSRS 
YRHKNSW 


6198 


111 


1912 


S EAAltS PS FISPACFLLRKLPALEDGTLPHPDTLGMN YEGARSE 
RENHAADDS EGGALDMCCKE R LPGLPQP IVMEALDEAEGLQDSQ 
REMPPP P P PSPPSDPAQKP PPRGAG SHS LTVRS S LCL FAASQ PL 
LACGVLWFSGYGHIWSC2NATNLVSSLI*TI»LKQLEPTAWLDSG , TVJ 
GVPSLLLVFLSGGLVLVTTLVWHIiLRTPPEPPTPLPPEDRRQSV 
SRQPS FT YSEWMEEKI EDDFJjDLDPVPETPVFDCVMDI KPEADP 
TS ZiT VKS MGLQERRGSNVSLTLDMCTPGCNEEGFG YLMS PRESS 
ARE YLLSASRVLQAEELHEKALDP FLLQAEFFE IPMNFVDPKE Y 
DIPGLVRKNRYKTILPNPHSRVCLTSPDPDDPLSSYINAMYIRG 
YGGEEKVYIATQGPI VSTVADFWRMVWQEHTPI I VKITNIEEMM 
EKCTEYWPEEQVAYDGVEITVQKVIHTEDYRLRLISLKSGTEER 
GLKHYWFTSWPDQKTPDRAPPLLHLVREVEEAAQQEGPHCAPII 
VHCSAGIGRTGCF I A?3ICCQQIiRQEGWD I LKTTCQLRQDRGG 
MIQHCEQYQFVHHVMSLYEKQLSHQSPE 


6199 


144 


1211 


MARENGESSSSWKKQAEDIKKIFEFKETLGTGAFSEWLAEEKA 
TGKLFAVKCI PKKALKGKESS IENE I AVLRKIKHENI VALEDI Y 
BSPNHLYLVMQLVSGGELFDRIVEKGFYTEKDASTLIRQVLDAV 
YYLHRMGIVHRDLKPENLLYYSQDEESKIMISDFGLSKMEGKGD 
VMSTAOGTPGYVAPEVLAQKPYSKAVDCWSIGVIAYILLCGYPP 
PYDENDSKLFEQILKAEYEFDSPYWDDISDSAKDFIRNLMEKDP 
NKRYTCEQAARHPWtAGDTALNKNIHESVSAQIRKNFAKSKWRQ 
AFNATAVVRH^KLHLGSSLDSSNASVSSSLSIASQKIXaSG'T'P 
HAL* 


6200 


702 


96 


L PE V PH SLRPR VKPHLCCAQPAVR VMARLPKLAVFDLD xTLWP F 
W VDTHVDPP FHKSSDGTVRDRRGQDVRLYPEVPEVLKRLQSLGV 
PGAAAS RTS EIEGANQLLELFDLFRYFVHRBI YPGS KI THFERL 
QQKTGI PFSQMI FFDDERRNI VDVSKLGVTCIHIQNGMNLQTLS 
QGLETFAKAQTGPLRSSLEES PFEA 


6201 


2809 


2383 


GQTPRVRWKMRRSLRAGKRRQtAGRKS^PPKVPIVIQDDSLPA 
GPPPQIRILKRPTSNGWSSPNSTSRPTLPVKSLAQREAEYABA 
RKRILGSASPEEEQEKPILDRPTRISQPEDSRQPNNVIRQPLGP 
DGSQGFKQRR 


6202 


2 


426


INADRAAVASSLLSRPTRKMAPQKDRKPKRSTWRFNLDLTHPVE 
DG I FDSGNFEQFLREKVKVNGKTGNLGN WHI KRFKNKITWSE 
KQ FS KRYLKYLTKK YLKKNNLRD WLRWASDKETYELR YFQ IS Q 
DKDESESED 


6203 


419 


2550 


RC PR PPATAGAAASRPDRS PPSG I SGSEAAAGAGAAAPASQHPA
TGTGAVQTEAMKQ I LGVI DKKLRNLEKKKGKLDD YQERMNKGER 
LNQDQLDAVSKYQEVTNNLEFAKELQRSFMALSQDIQKTIKKTA 
RREQLMREEAEQKRLKTVLELQYVLDKLGDDEVRTDLKQGLNGV 
PI LSEEELS LLDEFYKLVDPERDMSIiRljNRnYRH A« T WT .ur\T.T t? 

GKEKPVCGTTYKVLKEIVERVFQSNYFDSTHNHQNGLCEE3EAA 
SAPAVEDQVPEAE PE PAE E YTEQS E VESTE YVNRQFMAE TQFTS 
GEKEQVCEWTVETVBWNSLQQQPQAASPSVPEPHSLTPVAQAD 
PLVRRQRVQDLMAQMQGPYNFIQDSMLDFENQTIjDPAIVSAQPM 
NPTQNMDMPQLVCPP VHS ESRLAQPNQ VPVQP EATQVP LVS STS 
EGYTASQPLYQPSHATEQRPQKBPIDQIQATISLNTDQTTASSS 
LPAASQPQVFQAGTSKPLHSSGINVNAAPFQSMQTVFKMNAPVP 
PVNEPETLKQQNQYQASYNQSF5SQPHQVEQTELC?QEQLQTVVG 
TYHGSPDQSHQVTGNHQQPPQWGFPRSNQPYYNSRGVSRGGS 
RGARGLMNG YRG PANGFRGGYDGYRPS FSNTPNS G YTQS QFSAP 

RD YS GYQRDG YQQNFKRGSGQSGPRGAPRGRGGP PRPNRGMPQM 
NTQQVN 


6204 


2933 


787 


CTHNLISIJjGGRALIHFNRFLNLKIQEGEAHNIFCPAYDCFQLV 
PGDlIKSWSKEt4DKRYLQFDIKAFVENNPAIKWCPrPGCDRAV 

rltkqgsntsgsdtlsfpllrapavdcgkghlfcweclgeahep 
cdcqtwknwlqkitemkpeelvgvseayedaanclwlltnskpc 
ancks p iqknegcnhmqcakckydfcw i cleewkkhs fvhwevi 
yrctryeviqhveeqskemtveaekkhkrfqeldrfmhyytrfk 
nhehsyqleqrllktakekmeqlsralkbteggcpdttfiedav 

HVLLKTRRILKCSYPYGFFLEPKSTKKEIFEXiMQTDLEMVTEDL 
AQKVNRP YLRTPRHKI IKAACLVQQKRQEFLAS VARGVAPADSP 

eaprrsfaggtwdweylgfaspbeyaefqyrrrhrqrrrgdvhs 

LLSNPPDPDEPSESTIiDIPEGGSSSRRPGTSWSSASMSVLHSS 

slrdytpasrsbnqdslqalssldeddpnillaiqlslqesgia 
ldeetrdflsneasix^igtslpsrldsvprntdspraalssse 
llelgdslmrlgaendpfstdtlsskplsearsdfcpsssdpds 
agqdpnindmllgnimawfhdmnpqsialippatteisadsqlp 
cikdgsegvkdvelvlpedsmfedasvsegrgtqieenpleeni 
pgggkqhpqaw 


6205 


1 


1200 


rahrgkmalevgdmedgqi^dsdsdmtvapsdrplqlpk\n j ggd"'~ 

samrafqntatacapvshyravesvdsseesfsdsdddsclwkr 

xrqkcfnpppkpbpfqfgqssqkppvaggkkinniwgavlqeqn 

qdavatelgiu^egtidrsrqsetynyllakklrkesqehtkd 

ldkeldeymhggxkmgskeeengqghlkrkrpvkdrlgnrpemw 
YKGRYEITAEDSQEKVADEISFRLQEPKKDLIAJRVVRIIGNKKA
I3LLMETAEVEQNGGLPlrWGSRRKTPGGVFLNLLKNTPSISEE 
Q I KDI FYI ENQKE YENKKAARKRRTQVLG KKMKQA1 KSLNFQE D 
DDTSRETFASDTNEALASLDESQEGHAEAKLEAEEAIEVDHSHD 
LDIF 


6206 


10 


1442 


I ISERRERS CLHLVC IRCSCDWEMGS VLGLCSMASW I PCIiCGS 
AP CLLCRCC PSGNNS TVTRL I YALFLLVGVCVACVMLI PGMEEQ 
LNOPGFCENEKGWPCNILVGYKAVYRLCFX3LAMPYLLLSLLM 
I KVKSS SDPRAAVHNGFWFFKFAAAI AI I IGAFFI PEGTFTTVW 
F YVCT1AGAFCFILIQLVLL I DFAHSKNES WVEKMEEGNS RCWYA 
ALLSATALNYLLSLVAIVLFFVYYTHPASCSEWKAFISVNMLLC 
VGAS VMS I L P KIQES Q PRS GLLQ 3 S V I TVYTMYLTWSAMTNEP E 
TNCNPSLLS I IGYNTTSTVPKEGQSVQWWHAQGI IGLILFLLCV 
FYS3IRTSNNSQVNKLTLTSDESTLIEDGGARSDGSLEDGDDVH 
RAVDNERDGVTYSYSFFHFMLFLASLYIMMTLTNWYRYEPSREM 
K S Q WTAVWV KI S 5 S W I GI VL YVWTLVAP LVLTNRDFD 


6207 


2924 


1471 


^VMAEAATPGrTAWSiAGAAAATAAAASPTPIPTVTAPSLGAG 
GGGGGSDGSGGGWTKQVTCRYFMHGVCKEGDNCRYSHDLSDSPY 
SWCKYFQRGYCIYGDRCRYEHSKPLKQEEATATEI/TTKSSLAA 

ssshss i vgplvemntgeaes rnsnfatvgagsedwvnai efvp 
gqpycgrtapsctbaplqgsvtke3sekeqtavetkkqlcpyaa 
vgecrygencvylhgdscdmcglqvlhpmdaaqrsqh i ksciea 
hekdmelsfavqrskdmvogicmevvyekanpserrfgilsncn 

HTYC LKC I R KWRS AKQ FESK 1 1 KS CPE CRI TSNFVI PS E YWVEE 
KEEKQKLI LKYKEAMSNKACR YFDEGRGS CP FGGNC P YKHAYPD 
GRRESP^QKVGTSSRYRAQRRNHFWBLIEBRENSNPFDNDEEE 
VVTF3I/3EMLLMLLAAGGDDELTDSEDEHDLFHDELEDFYDLDL 


6208 


2924 


1471 


TVMAEAATPGTTATTSGAGAAAATAAAASPTPIPTVTAPSLGAG 
GGGGGSDGSGGGWTKQVTCRYFMHGVCKEGDNCRYSHDLSDSPY 
SWCKYFQRGYCIYGDRCRYEHSKPLKQEEATATELTTKSSLAA 
S S SLS S I VG P LVEMNTGE AES RNSNFATVGAGSEDW VNAI EFVP 
GQPYCGRTAPS CTEAPLQGSVTKEES EKEQTAVETKKQLCP YAA 
VGECR YGENC VYLHGDS CDMCGLQVLHPMDAAQRSQHIKS CIEA 
HEKDMELSFAVQRSKDMVCGI CME WYEKANPSERRFG ILSNCN 
HTYCLKCIRKWRSAKQFESKI I KSCPECRI TSNFVI PSE YWVEE 
KEEKQKLILKYKEAMSNKACRYFDEGRGS CP FGGNCF YKHAYPD 
GRREBPQRQKVGTSSRYRAQRRNHFWELIEERENSNPFDNDEEE 
WTFELGEMLLMLLAAGGDDELTDSEDEWDLFHDELEDFYDLDL 


6209 


1758 


829 


ERLCFPCMQSKIYSYMSPNKCSGMRFPLQEBNSVTHHEVKCQGK
PLAG I YRKREE KRNAGNAVRSAMKS EEQKI KDARKGPLVP FPNQ 
KS B AAEP PKTPP SSCDSTNAAIAKQALKKP I KGKQ APRKKAQGK 
TQQNRKLTDFYP VRRS SRKS KABLQS EERKR I DEL I ESGKE EGM 
KIDLIDGKGRGVIATKQFSRGDFVVEYHGDLIEITDAKICREALY 
AQDP STGCYMYY FQ YLS KT YCVDATRETNRLGRL INHSKCGNCQ 

KH 


6210 


3761 


387 


IFGMSKLR^LEDSGSADFRRHFWLSPFTITVVLtiSACFVT 
SSIX5GTDKELRLVDGENKCSGRVEVKVQEEWGTVCNNGWSMEAV 
SVICNQLGCPTAIKAPGWANSSAGSGRIWMDHVSCRGNESALWD 
CKHDGWGKHSNCTHQQDAGVTCSDGSNLEMRLTRGGNMCSGRI E 
I KFQGRWGTVCDDNFNIDHAS VTCRQIiE CGS AVS FSGSSN FGEG 
SGPIWFDDLICNGNESALWNCKHQGWGKHNCDHAEDAGVICS KG 
ADLSLRLVDGVTECSGRLEVRFQGEWGTICDDGWDSYDAAVACK 
QLGCPTAVTAIGRVllASKGFGHIWI^SVSOCySHEPAVWQCKHHE 
WGKHYCMiNEDAGVTC^DGSDLEIiRLRGGGSRC^GTVEVEIQRL 
I^KVCDRGWGLKEADWCRQLGCGSAIjKTSYQVYSKIQATNTWL 
FLSS CNGNETS LWD CKNWQWGGLTCDH YE EA KITCSAHRE PRI*V 
GGDIPCSGRVEVKHGDTWGSI CDSDFSLEAASVLCRELQCGTVV 
S I LGGAH FGEGNGQ I WAEE FQ CEGHE SHLSLCPVAPRPEGTCSH 
S RDVGWCS RYTE I RLVNGKTPCEGRVELKTLGAWGSLCNSH WD 
IEDAHVLCQQLKCGVALSTPGGARFGKGNGQIWRHMFHCTGTEQ 
HMGDCPVTALGASLCPSEQVASVICSGNQSQTLSSCNSSSLGPT 
RPTIPEESAVACIESGQLRLVNGGGRCAGRVEIYHEGSWGTICD 
DS WDLSDAHVVCRQLGCGEAINATGSAH FGEGTG P I WLDEMKCN 
GKESR I WQCHSHGWGQQNCRHKEDAGVI CSE FMSLRLTSE AS RE 
ACAGRLEVFYNGAWGTVGKSSMSETTVGWCRQLGCADKGKINP 
ASLDKAMS I PMWVDNVQCPKG PDTLWQCPS S PWEKRLAS PSEET 
W I TCDNK I RLQEG PTS CSGR VE I WHGGS WGTVCDDS WDLDDAQV 
VCQQEX3CGPALKAFKEAEFGQGTGPIWLNEVKCKGNESSLWDCP 
ARRWGHSECGHKEDAAVNCTDISVQKTPQKATTGRSSRQSSFIA 
VG I LGWLLAI FVALFFLTKKRRQRQRLAVS SRGENL VHQ IQ YR 
EMNSCLNADDLDLMNS s GGHSEPH 


6211 



3761 


387 


IKGMSKLRMVLLEDSGSADFRRHFVNLSPFTtTWLLLSACFVT 
SSLGGTDKELRLVDGENKCSGRVEVKVQBEWGWCNNGWSMEAV 
S V I CNQLG CPTA1KAPG WANSSAGSGRIWMDIIV3CRGNESALWD 
CKHDGWGKHSNCraQQOAGVTCSDGSNLEMRIiTRGGNMCSGRlE 
I KFQG RWGTVCDDNFN I DHASVI CRQLECGS AVS FSGS S N FGEG 
SGPIWFDDLI CNGNES ALWNCKHQG WGKHNCDHAEDAG V I CS KG 
ADLSLRLVDGVTECSGRLEVRFQGEWGTICDDGWDSYDAAVACK 
QLGCPTAVTAI GRVNAS KG FGHI WLDS VS CQGHE P AVWQ CKKHE 
WGKHYCNHNEDAGVTCSDGSDLELRLRGGGSRCAGTVEVEIQRL 
LGKVCDRGWGLKEADVVCRQLGCXSSALKrs YQVYS K IQATNTWL 
FLS S CNGNETS LWDCKNWQWGGLTCDHYEEAKITCSAHRBPRLV 
GGDIPCSGRVEVKHGDTWGSICDSDFSIjEAASVLCREIjQCGTW 

s i lggahfgegngqi waee fqceghe shlslcpvaprpegtcsh 
srdvgwcsryte i rlvngktpcegrvelktlgawgslcnshwd 
iedahvlcqqlkcgvalstpggarfgkgngqiwrhmfhctgteq 
hmgdcpvtalgaslcpseqvasvicsgnqsqtlsscnssslgpt 
rptipeesavaciesgqlrlvngggrcagrvbiyhegswgticd 
dswdlsdahwcrqlg cgeainatgs ah fgegtg p i wldemkcn 
gkesriwqchshgwgqqncrhkedagvicsefmslrltseasre 
acagrlevfyngawgtvgkssmsettvgvvcrqlgcadkgkil^p 
asldkamsipmwvdnvqcpkgpdtlwqcpsspwekrlaspseet 
witcdnkirlqegpts csgrve i whggswgtvcddswdlddaqv 
vcqqlgcgpalkafkeaefgqgtgpiwlnevkckgnesslwdcp 

iuCKW^noCCQlliUsiJAA VKLT1 PXSVQXTPQKATTGRSSRQSSFIA 
VGILGWLLAI FVAL FFLT KKRRQRQRLAVS SRGENLVHQ 1 Q YR 
EMNS CLNADDLDLMNSSGGHSEPH 


6212 


1 


1134 


LKWELRPGGAVWGTGRGAGTGAPRSCCCQTNPGPPSSLRRAFRR 
RELPF PACHE IGLGAEAGSG P PPAPAARESRSRAME EE AS S PGL 
GCS KPHLEKLTLGITR ILBSSPGVTEVTI IEKPPAERHMIS SWE 
QKNNCVMPEDVKNFYLMTKGFHMTWSVXLDEHI I PLGSMAINS I 
SKLTQLTQSSMYSLPNAPTIiADLEDDTHEASDDQPEKPHFDSRS 
VXFELDS CNGSGKVCLVYKSGKPALAEDTEI WFLDRALYWHFLT 
DTFTAY YRLLITHLGLPQWQYAFTS YG IS PQAKQRVSMYKP IT Y 
NTNLLTEETDSFVNKLDPSKVFKSKNKIVIPKKKGPVQPAGGQK 
GPSGPSGP3TS3T3KS3SGSGNPTRK 


6213


1 


1134 


LKWELRf GGAVWGTGRGAGTGAPRSCCCQTNPGPPSSLRRAFRR 
RELPFPACHEIGLGAEAGSGPPPAPAARESRSRAMEEEASSPGL 
GCSKPHLEKLTLGITRILESSPGVTEVTI I EKPPAERHMISSWE 
QKNNCVMPEDVKNFYX*MTNGFHMT>ISVKLDEHIIPLGSMAINSI 
SKLTQLTQSSMYSLPNAPTLADLEDDTHEASDDQPEKPHFDSRS 
VIPEIiDSCNGSGKVCLVYKSGKPAXAEDTEIWPLDRALYWHFLT 
DTFTAYYRLL ITHLGL PQWQYAFTS YGISPQAXQR VSM YKPI T Y 
NTNLLTEETDSFVNKLDPSKVFKSKNKTVIPKKKGPVQPAGGQK 
GPSGPSGPSTSSTSKSSSGSGNPTRK 


6214 


2 


460 


heiapsairraari^ujparwqsraAAfVfvrgfrtgws FVGWV 

VLGTSAKRTRLFFFLSKMAASSRAQVLALYRAMLRESKRFSAYN 
YRTYAVRR I RDAFRENKNVKDPVE IQTLVNKAKRDLGVIRRQVH 
IGQLYSTDKLI I ENRDMPRT 


6215


2 


1849 


FVAGGPRGSGSAAETMPE IRVTPLGAGQDVGRS CI LVSlAGKNV 
MLDCGtf HMG FNDDRRFPD FS YlTQNGRtiTDFLDCVI ISHFHLDH 
CGALP YFS EMVGYDGP I YMTHPTQAI C PI LLEDYRK IAVDKKGE 
AN FFTSQM I KDCMKKWAVHLHQTVQVDDELE I XAYYAGHVLGA 
AMFQIKVGSESWYTGDYNMTPDRHLGAAWIDKCRPNLIjITEST 
YATTI RDSKRCRERDFLKKVHETVERGGKVLIPVFAIiGRAQELC 
ILLETFWERMNLKVPI YFSTGLTEKANHYYKLFI PWTNQKIRKT 
FVQRNMFEFKHIKAFDRAFADNPGPMWFATPGMLHAGQSLQIF 
RKWAGNEKNMVIMPGYCVQGTVGHKILSGQRKLEMEGRQVLEVK 
WVEYMSFSAHADlAKCIMQLVGQAEPESVLLVHGEAKKMEFLKQ 
KIEQELRVNCYMPANGETVTLPTSPS IPVGISLGLLKREMAQGL 
LPEAKKPRLLHGTLIMKDSNFRLVSSEQAIiKEIXSIiAEHQLRFTC 
RVHLHDTRKBQETALRWSHLKSVLKDHCVQHLPDGSVTVESVL 
LQAAAPSEDPGTKVLLVSWTYQDEELGSFLTSLLKKGLPQAPS 


6216 


11 


393 


QTTRPEPRNSAbRQSRSKMAWGVSSVSRLLGRSRPQLGRPMSS 
GAHGE EGS ARM WKTLTFFVAliPGVAVSMLNVYLKSHHGEHER PE 
FIAYPHLRIRTKPFPWGDGNHTLFHNPHVNPIiPTGYEDE 


6217 


9 


1178 


TRVGRGBSGLKMEVKPPPdRPQPDSGRRRRRRGbEGHDPKEPEQ 
LRKLF IGGLS FETTDDSLREHFEKWGTLTDCWMRDPQTKRSRG 
FGFVTYSCVEBVDAAMCARPHKVDGRVVEPKRAVSREDSVKPGA 
HLTVKKI FVGGI KEDTEE YWLRDYFEKYGiaETIE VMEDRQSGK 
KRGPAFVTFDDHDTVDKIVVQKYHTINGHNCEVKKALSKQEMQS 
AGSQRGRGGGSGNFMGRGGNFGGGGGNFGRGGNFGGRGGYGGGG 
GGSRGS YGGGDGGYNGFGGDGGNYGGGPGYS SRGGYGGGGPG YG 
NQGGG YGGGGGYDG YNEGGNFGGGN YGGGGNYNDFGN YSGQQQS 
NYGPMKGGSFGGRSSGSPYGGGYGSGGGSGGYGSRRF 


6218 


130S 


906 


S CERRGF IMADDLKRFLYKKLPS VEGLHAI WSDRDGVP VI KVA 
NDNAPEHALRPGFliSTFAliATDQGSKIiGLSKNXS 1 1 CYYNT YQV 
VQFN^LPLWSFIASSSANTGLIVSLEKEIAPLFEELRQVVEVS 


6219 


2 


890 


AGPGEGAGAGTRCAGAEAEMAS AGGEDCESPAPBADRPHQR PFL
IGVSGG TASGKS TVCBKIMELLGQNE VEQRQRKWILSQDR F YX 
VLTABQKAKALKGQYNFDHPDAFDNDLMHRTLKNIVEGKTVEVP 
T YDFVTHS RLPETTWYPAD WLFEG I LVFYSQE I RDMFHLRLF 
YDTDSDVRLSRRVLRDVRRGRDLEQIIjTQYTTFVKPAFEEFCLP 
TKKYADVIIPRGVDNMVAINIjIVQHIQDILNGDICKWHRGGSNG 

rsykrtfsepgdhpgmltsgkrshlesssrph 


6220 


£.£. f 


764 


EQNISLEMSCTIEKAIiADAKALVERLRDHDDAAESLIEQTTALN
KRVEAMKQYQEEIQELNEVARHRPRSTLVMGIQQENRQIRELQQ 

enkelrts leehqsalelimskyreqmfrllmaskkddpgi imk 
lkeqhskidmvhrnksegffldasrhileapqhglerrhleanq 

NVH 


6221 


98 


916 


RW I WDLNP VS DGLELRPKYKG ILHCLTT I W KLDGIiRGLYQGVTP 

niwgaglswglyfvfynaiksyktegraerleateylvsaaeag 
amtlcitnplwvtktrlmlqydawnsphrqykgmfdtlvkiyk 
yegvrglykgfvpglfgtshgalqfmayellklkykqhinrlpe 
aqlstveyisvaalskifavaatypyqwrarlqdqhmfysgvi 
dvitktwrkegvggfykgiapnlirvtpaccitfwyenvshfl 
ldlrekrk 


6222 


2 


2116 


MAREIjRAIJjLWGRRLRPIjIJIAPALAAVPGGKP I LCPRRTTAQLG 
PRRN PAWSLQAGRLFSTQTABDKEE PLHS I ISSTBS VQGSTS KH 
EFQABTKKLLDIVARSLYSEKEVFIJRBLISNASDALEKLRHKLV 
SDGQAL P EM E I HLQTNAE KGT I TIQDTG IGMTQEELVSNLGTIA 
RSGSKAPLDALQNQAEASSKI I GQFGVGFYSAFMVADRVB VYSR 
SAAPGSLGYQWLSDGSGVPEIAEASGVRTGTKII IHLKSDCKEP 
SSEARVRDWTKYSNFVSPPLYLNGRPJWTIX3AIWMMDPKDVRE 
WQHEEFYR YVAOAHDKPRYTLHYKTDAPLNIRS I F YVPDMKPSM 
FDVSREIiGSSVTUjYSRKVLIQTKATDILPKWLRFIRGWDSEDI 
PLNLSRELLOESALI RKLRDVLOnRTi T TTPP T IlrtQ V KTi Ji 1? K vb VP 
FED YGLFMR3GI VTATEQE VKED I AKLLRYES SALPSGQLTS LS 
E YASRMRAGTRNI YYLCAPNRHLAEHSPYYEAMKKKDTEVLFCF 
EQFDELTLLHLRBFDKKKLI S VETDIWDHYKEEKFEDRSPAAE 
CLS EKETE ELMAWM RNVLGS R VTNVXVTLRLDTHP AMVTVLBMG 
AARHFLRMQQLAKTQEERAQLLQPTLE I N PRHAL I KKLNQLRAS 
EPGLAQLLVDQIYENAMIAAGLVDDPRAMVGRLNELLVKALERH 


6223 


3 


715 


DAWARTMAGMVDFQDE EQVKS FLENME VECN YHC YHEKD"PDGCY 
RLVDYLEG IRKNFDEAAKVLKPNCEENQHSDS CYKLGAYYVTG K 
GGLTQDLKAAARCPLMACEKPGKKS I AACHNVGLLAHDGQVNED 
GQPDI/3KARDYYTRACDGGYTSSCFNLSAh3PLQGAPGFPKDMDL 
ACKYSM KACDLGH I WACANASRMY WjGDGVDKVEAKABVLKNRA 
Q QVHKEQQKG VQPLTFG 


6224 


1 


133 


LRTI SSMAWGPLLLTLLAHCTGSWAQSVLTQPPS VSGARI PHEk


6225 


3259 


938 


LLS CHRLAI CKIjPFS VESRKTVMGPQGARRQAFLAFGDVTVDFT
QKEWRIiLSPAQRALYREVTI*ENYSHLVSLGIIJISKPBLIRRLEQ 
GEVPWGEERRRRPGPC^IYAEHVLRPKNLGLAHQRCXX}LQFSD 
QSFQSDTABGQEKEKSTKPMAFSSPPLRHAVSSRRRNSWEIES 
SQGQRENPTE I DKVLKG I ENSR WGA FKCAERG QDFSRKMM VI IH 

RGFS LKANLLRHQRTHSGE KP FLCKVCGRG YTS KS YLTVHERTH 
TGEKPYECQECGRRFNDKSS YNKHLKAHSGE KP FVCKECGRG YT 
NKSYFWHKRIHSGEKPYRCQECGRGFSNKSHLITHQRTHSGEK 
PFACRQCKQS FS VKGSIiLRHQRTHSGEKPPVCKDCERS FSQKST 
LV YHQRTHS GEKPFVCRECGQG PI QK5TL VKHQ ITHSEE KPFVC 
KDCGRGFIQKSTFTLHQRTHSEEKPYGCRBCX3RRFRDKSSYNKH 
LRARI/3EKRFFCRDCGRGFTLKPNLTIHQRTHSGEKPFMCKQCE 
KS FS LKANLLRHQ WTHSGERPFNCIU>CGRGFI LKS TLLFHQKTH 
SGBKPFICSECX3CK3FIWKSNLVKHQLAHSGKQPFVCKECGRGFN 
WKGNLIiTHQRTHSGEKPFVCNVCGQGFSWKRSLTRHHWRIHSKE 
KPFVCQECKRGYTSKSDLTVHERIHTGERPYECQECGRKFSNKS 
YYS KHLKRHLREKRFCTGS VGEASS


6226 


29 


266 


TKVSELLGGSQPXFFIiPLWRRLCRCGLGPRVS PMAG PRVEVDGS 
IMEGGGQSLRVSTGLSWLLSLPWRAQRIRAGRSYA 


6227


2581 


890 


MSASSLLEQRPKGQGNKVQNGSVHQKDGLNDDDFEPYLSPQARP 
NNAYTAMSDS YLPS YYS PS IGFS YS LGEAAWS TGGDTAMP YIiTS 
YGQLSNGEPKFLPDAMFGQPGALGSTPFLGQHGFN?KPSGIDFS 
AWGNNSSQGQSTQS SGYSSNYAYAPSSLGGAM 1 DGQSAFANETL 
NKAPGMNTIDQGMAALKLGSTEVASNVPKVVGSAVGSGS ITSNI 
VASNSLPPATIAPPKPASMADIASKPAKQQPKLKTKNGIAGSSL 
PPPP I KHNMD IGTWDNKGP VAKAPSQAL VQNIGQ PTQGS PQPVG 
QQANNSPPVAQASVGQQTQPLPPPPPQPAQIjSVQQQAAQPTRWV 
APRNRGS GFGHNGVDGNGVGQS QAGSGSTPSEPHP VLEKLRS IN 
N YN PKDFDWNLKHGRVFI IKS YSEDD I HRS I KYN I WCSTBHGNK 
RLDAAYRSMNGKGPVYLLFSVNGSGHFCGVAEMKSAVDYNTCAG 
VWSQDKWKGRFDVRWIFVKDVPNSQLRHIRLENNENKPVTNSRD 
TQEVPLEKAKQVLKI IASYKHTTS IFDDFSHYEKRQ 


6228


47


1978 


GRRCRRRGAVMELAQEARELGCWAVEEMGVPVAARAPKSTLRRL 
CLGQGADI WAY ILQHVHS QRT VKKIRGNLLW YGHQDS PQVRR KL 
ELEAAVTRLRAEIQEIiDQSLELMERDTEAQDTAMEQARQHTQDT 
QRRALLLRAQAGAMRRQQHTLRDPMQRLQNQLRRLODMERKAKV 
DVTPGSLTSAAU3LEPWLRDVRTACTLRAQPLQNLLLPQAKRG 
S LPTPHDDHFGTS YQQWLSSVETLLTNHPPGHVLAALEHLAAER 
EAEIRSLCSGDGIX3DTEISRPQAPDQSDSSQTLPSMVHLIQEGW 
RTVGVLVSQRSTLLKERQVLTQRLQGLVEEVERRVLGSSERQVL 
I LGLRRCCL WTELKALHDQSQELQDAAGH RQLLLRE LQAKQQRI 
LH WRQLVEETQEQVRLLI KGNSASKTRLCRSPGE VLALVQRKVV 
PTFEAVAPQSRELLRCLEEEVRHLPHILLGTLLRHRPGELKPLP 
TVLPS IKQLHPAS PRG S S F I ALS H KLGLP PG KAS E LLLPAAAS L 
RQDLLLLQDQRSLWCWDLLHMKTSLPPGLPTQELLQIQASQEKQ 
QKENLGQALKRLEKLLKQALERIPELQGIVGDWWEQPGQAALSE 
ELCQGLSLPQWRLRWVQAQGALQKLCS 


6229 


1571 


560 


GPSLI^TRGTPNPARTLQIFFLlIGRRLTGRMAAVDDLQFEfifG
NAATSLTAJN PDATTVN I EDPGETPKHQPGS PRGSGREBDDELLG 
NDDSDKTELLAGQKK5 S P FWTFB YYQTFFDVDTYQVFDR I KGSL 
LPIPGKNFVRLYIRSNPDLYGPFWICATLVFAIAISGNLSNFLI 
HLGEKTYHYVPEFRKVSIAATIIYAYAWLVPLALWGFIjMWRNSK 
VMNI VSYS FIiE IVCVYGYSLPI YI PTAIfcWI I PHKAVRWILVMI 
ALGISGSLLAMTFWPAVREDtmRVAJQATIVTIVLLHI^LSVGCL 
AYFFDAPEMDHLPTTTATPNQTVAAAKSS 


6230 


1723 


600 


SKMSGRSGKKKMSKLSRSARAGVI F PVGRLMRy L5fCKGTFKYR~IS 
VGAP VYMAA VI EYLAAE I L ELAGNAARDN KKAR IAP RH I LLAVA 
NDEELNQLLKGVTIASGGVLPRIHPELLAKKRGTKGKSETILSP 
PPE KRGRKATSGKKGGKKSKAAKPRTSKKSKPKDSDKEGTSNST 
SEDGPGDGFTILSSKSLVLGQKLSLTQSDISHIGSMRVEGIVHP 
TTAE IDLKED1 GKALEKAGGKEFLETVKELRKSQGPLEVAEAAV 
SQS SGLAAKFVIHCH I PQWGSDKCEEQLEET I KNCLS AAEDKKL 

KSVAFPPFPSGRNCFPKQTAAQVTLKAISAHFDDSSASSLKNVY 
FLLFDSES IGI YVQEMAKLDAK 


6231 


149 


870 


lilFSSSTMDkSLRNVLVVSF'fiFLLLFTAYGGLQSLQSSLYSB^ 
LGVTALSTtiYGGMLLSSMFLPPLLI ERLGCKGTI ILSMCGYVAF 
SVGNFFASWYTLIPTSILLGLGAAPLWSAQCTYLTITGNTHAEK 
AGKRGKDMVNQYFGIFFLI FQSSGVWGNIiISSLVFGQTPSQETIi 
PEEQLTS CGAS DCLMATTTTNSTQR PSQQLVYTLIjGI YTGSG VL 
AVLM I AAFLQP I RD VQRESE 


6232


3679


1476


FVAGTTMAGFWVGTAPLVAAG^GRWP

YYS RQCLMVS RNLGSVG YDPNEKTFDK I LVANRGE I ACRVI RT C
KKMG I KTVAI HS D VDAS S VHVKMADEAVCVG P AP TS KS YLNMDA
I ME A I KKTRAQAVHPG YG FLSENKE FARdiAAEDWFIGPDTHA
I QAMGDKIES KLIAKKAEVNTI PGFDG VVKDAEE AVRIARE I G Y
PVMIKASAGGGGKGMRIAWDDEETRDGFRLSSQEAASSFGDDRL
LXEKF XDNPRHIEXQVLGDKHGNALWLNERECS IQRRNQKWEE
APSIFLDAETRRAMGEQAVAIiARAVKYSSAGTVEFLVDSKKNFY
FUSMNTRLQVEHP VTEC I TGLDLVQ EM I RVAKG YP LRHKQAD I R
ING WAVECRVYAEDP YKS FGLPS IGRLSQYOB PLHLPGVRVDS G
1QPGSDIS1YYDPMISKLITYGSDRTEALKRMADALDNYVIRGV
THNIALLREVIINSRFVKGDISTKFLSDVYPDGFKGHMLTKSEK
NQLLAIASSIjFVAFQLRAQHFQENSRMPVIKPDIANWELSVKLH
DKVHTWASNNG S V7S VEVDGSKLNVTSTWNLAS PLLS VS VD GT
QRTVQCLSREAGGNMSIQFI^TVYKVNILTRLAAELNKFMLEKV
TEDTSSVLRSPMPGWVAVS VKPGDAVAEGQEICVI EAMKMQNS
MTAGKTGTVKSVHCQAGDTVGEGDLLVEIiE


6233


1


2654


HSTRENLNAGNFNFPSEGHLVRSTGPGGSFAKHMVAQCVSPKGP
LACSRT YFFGATHVP YLGGDS KLP KKTEQ I RLLSQ I YAAVI EAV 
LAG I AC YAKTS S LTKAKEVAEQTLGSGLDS FEL I P F KAALRS KM 
TFHIHAVNNQGRIVPIiDSEDSLSFVKTACMAVYDI PDLLGGNGC 
LGSWFSESFLTSQILVKEKDGTVTTETSSWLTAAVPRFCSVJL 
VEDNEVKL S E KTHQAVRGDE S FLGTYLTGG EGAYL YS S NLQ S WP 
EEGNVHFFS SGLLFSHCRHGS I 1 1 SKDHMN3 1 S FYDGDSTS TVA 
ALLIDFKSSLLPHLPVHFHGSSNFLMIALFPKSKIYQAFYSEVF 

CT HtrA^\TMv?Or*YC»T VtT TADIVT OWnVBT UOP^rVlTT pCRT O^tSTVI** 

SLWKQuDNSGXSLKv XQEDGIjSV 

EKRSSLKLLSAKLPELDWFLQHFAISSISQEPVMRTHLPVLLQQ 

AEINTTHRIESDKVIISIVTGLPGCHASELCAFLVTLHKECGRW 

M VYRQI MDSSECFHAAKFQRYLS SALE AQQNRSARQSAYIR KKT 

RLLWLQGYTDVIDWQALQTHPDSNVKASFTIGAITACVEPMS 

CYMEHRFLFPKO^CSQGLVSNVVFTSaTTEQRHPLLVQLQSL 

IRAANPAAAF1LAENGIVTRNEDIELILSEMSFSSPEMLRSRYL 

MYPGWYEGKLMAGS\nrPLMVQICVWFGRPLEKTRFVAKCKAIQS 

S I KPS P FS GNI YH I LGKVKFSDSERTME VCYNTLANSL S I MP VL 

EGPTPPPDSKSVSQDSSGQQECYLVFIGCSLKEDSIKDWLRQSA 

KQKPQRKALKTRGMLTQQEI RS IHVKR H LE P LPAG YFYNGTQF V 

NFrGDKrUfcHPLMDQFMNDYVEKA^ 

ELKP 


6234 


1731 


404 


P R VREJDMDHKS PGNKGS LVYAG I KS IVKSSLGMVESSRHNWSGL 
D KQSDI QNLNEERI LALQLCGWI KKGTDVDVGPFLNSLVQEGB W 
ERAAAVALFNLD IRRAIQ ILNEGAS SE KGDI^LNVVAMALSGYT 

KVAVRDRVAFACKFLSDTQLNRYIEKLTNEMKEAGNLEGI LLTG 
LTKDGVDLMES YVDRTGDVQTAS YCMI >QGSPLDVLKDERVQYW I 
ENYRNLL DAWR FWH KRAEFD I HRS KLD P S S KP LAQ VFVS CNFCG 
XSISYSCSAVPHQGRGFSQYGVSGSPTKSKVTSCPGCRKPLPRC 
ALCLINMGTPVSSC PGGTKS DEKVDLS KDKKLAQFNNWFTWCHN 
CRHGGHAGHML^WFRDHAECPVSACTCKCMQLI^TTGNLVPAETV 
QP 


6235 


1 


571 


EKRDHRLPSWPRAALKVPGRGGRVGTTPELAAGGIMATRNPPPQ 
D YE SDDDS YBVLDLTE YARRHQW WNR VFGHS SGPMVE KYS VATQ 
I VMGGVTGWCAGFLFQKVGKLAATAVGGGFLLLQI ASHSG YVQ I 
DHKRVE KDVNKAKRQ I KKRANKAAPE INNLI EEATEF I KQN1 VI 
SSGFVGGFLLGLAS 


6236 


1 


703 


WDQNKGAAAGSGLTLPSLPSARFSAGPPTQRSRPTMSNMEKHLF 
NLKFAAKELSRSAKKCDKEEKAEKAKIKKAIQKGNMEVARIMAE 
NAI RQKNQAVN FLRMS ARVDAVAARVQTAVTMGKVT KS MAG WK 
SMDATLKTMNLEKISALMDKFBHQFETLDVQTQQMEDTMSSTTT 
LTTPQNQVDM LLQEMADEAGLDLNMELPQGQTGS VGTS VAS AEQ 
DELSQRLARLRDQV 


6237 


312 


720 


ptamaeegiaaggvmdvntalqbvlktalihtolargireaaka 
ldkrqahlcvlasncdepmwklvealcaehqinlikvddnkkl 
gewvgl<:kidregkprkwgcscvwkdygkesqakdviebyfk 

CKK 


6238


? 


4666 


eevptqesvkweinviiknpeivfvadmtkndapalvittqgei 
cykgnlenstmtaaikdlqvracpflpvkrkgkittvlqpcdlf 
yqttqkgtdpqvidmsvksltlkvspviiktmititsalyttke 
tipeetasstahlwekkdtktlkmwfleesnetekiapttelvp 
kgem x kmnxds i fi vleagighrtvpmllaksrfsgegknwssl 

INIjHCQLELEVHYYNEMFGVWEPIiLEPIjEIDQTEDFTIPWNLGIK 

mkkkakmaivesdpeeenykvpeyktvisfhskdqlnitlskcg 
lvt^lnnlvkafteaatgssadfvkdlapfkilnslgltisvsps 

DS FS VLN I PMAKS YVLKNGE SLSMDY I RTKDNDHFNAMTS LSS K 
LFFILLTPVNHSTADKIPLTKVGRRLYTVRHRESGVERSIVCQI 
DT VEGS KKVT I RS P VQ IRNHFS VPLSVYEGDTLLGTAS PBNEFN 
I PLGSYRSFI FLKPEDEN YQMCEG I DFB E 3 1 KNDGALLKKKCRS 
KNPSKESFLINIVPEKDNLTSLSVYSEDGVIDLPYIMHLWPPILL 
RNLLP YKI AYYIEG I ENSVFTLSEGHSAQ I CTAQLGKARLHLKL 
LDYLNHDWKSEYHIKPNQQDIS FVSFTCVTEMEKTDLDIAVHMT 
YNTGQTWAFHSPYWMVNKTGRMLQYKADGIHRKHPPNYKKPVL 
FS FQ PNH FFNNNKVQtiMVTDSELSNQFS I DTVGSHGAVKCKGLK 
MDYQVGVTIDLSSFN1TRIVTFTPFYMIKNKSKYHISVAEEGND 
KWLSLDLEQCIPFWPEYASSKLLIQVERSEDPPKRIYFNKQENC 
I LLRLDKELGG 1 I AE VNLAEHS T VI TFLD YHDGAATFLLINHTK 
N E LVQ YNQS S LS E I EDS L P PGKAVFYTWAD P VGS RRL KWRCRKS 
HGEVTQKDDMMMPIDLGEKTIYLVSFFEGLQRIILFTEDPRVFK 
VTYES E KAELAEQE I AVALQDVG I S LVNNYTKQE VAY IG ITS SD 
WWETKPKKKARWKPMSVKHTBKLEREFKBYTESSPSEDKVIQL 
DTNVPVRIiTPTGHNMKILQPHVIALRRNYLPALKVEYNTSAHQS 
SFRIQIYRIQI QNQ IHGAVFPF VFYP VKPPKS VTMDSAPKPFTD 
VSIVMRSAGHSQISRIKYFKVLIQEMDLRLDLGFIYALTDLMTE 
AEVTENTEVELFHKDIEAFKEBYKTASLVDQSQVSLYEYFHISP 
I KLHLS VSLSSGREE AKDS KQNGGL I PVHS LNI»LLKS I GATLTD 
VQDWFKLAFFELNYQFHTTSDLQSEVIRHYSKQAIKQMYVIjIL 
G LDVLGNP FGL I REFSEGVEAF FYE P YQGAI QGPEE FVEGHALG 
LKALVGGAVGGIJ^AJ^KITGAMAKGVAAMTMDEDYQQKRREAM 
NKQPAGFREG ITRGGKGLVSGF VSG I TG I VTKP I KGAQKGGAAG 
FFKGVGKGLVGAVAR PTGG 1 1 DMAS STFQG I KRATETS EVES LR 
PPRFFNEDGVIRPYRLRDGTGNQMIiQKIQFYREWIMTHSSSSDD 
DDDDDDDDBSDLNH 


6239 


2108 


634 


KPGMAGKGSSGRRPLLLGLLVAVATVHLViCPYTKVEESFNLQA 
TKDLLYHWQDLEQ YDHLE FPGVVPRTFLGPVVIAVFSS PAVYVL
SLLEMS KFYSQLI VRGVLGLGVI FGLWTLQKEVRRHFGAMVATW
FCWVTAMQFHLM PYCTRTLPNVLAIiPVVLIjAIJ^WIjRHEWARFI 
WLSAFAI I VFRVELCLFLGLLI^LAIXtNRKVS WRALRHAVPAG 
ILCLGLTVAVDSYFWRQLTWPEGKVLWYNTVTjNKSSNWGTSPLL 
wyfysalprglgcsllfi PLGLVDRRTHAPTVLALGFKALYSLL 

GHLWNAAYSATALYVSHFNYPGGVAMQRLHQLVP pqtdvllhi 
DVAAAQTGVSR FLQVNS AWR YDKRED VQPGTGM1AYTH I LMEAA 
PGLLALYRDTHRVLASWGTTGVSLNLTQLPPFNVHLQTKLVLL 
ERLPRPS 


6240 


2202 


117* 


HERGDS LKE PTS I AESSRH PS YRSEPSLEPES FRS PTFGKS FHF 
DPLSSGSRSSSLKSAQGTGFELGQLQSIRSEGTTSTSYKSLANQ 
TRNGSLSYDSLLTPSDSPDF3SVQAGPEPDPPLGYTSPFLSARL 
AQQREAERHPRLVPTGPTHREPSPVRYDNLSRHIVASLQEREKL 
LRQSPPLPGREEEPGLGDSGIQSTPGSGHAPRTSSSSDDSKRSP 
LGKTPLGRPAVPRFGKPDGLRGRGVGSPEPGPTAPYLGRSMSYS 
SQKAQPGVSETEEVALQPLLTPKDEVQLKTTYSKSNGQPKSIiGS 
ASPG PGQP PLSS PTRGGVKKVSGVGGTTYE I S V 


6241 


3 


1341 


RNAEEKKRLSLQREKI IARVS IDNRTRALVQALRRTTDPKLCIT 
R VE ELTFHLLEFPEGKGVAVKERI I P YLLRLRQ I KDE TLQAAVR 
EILAL IG YVD P VKGRGIR I LS IDGGGTRGWALQTLRKLVELTQ 
KPVHQLFDYICGVSTGAILAFMLGLFHMPLDECEELYRKLGSDV 
FSQNVI VGT\^KMSWSHAFYDSQTWENILKDRMGSALMIETARNP 
TCPKVAAVSTIVNRGITPKAFVFRNYGHFPG INSHYLGGCQYKM 
^AIRASSAAPGYFAEYALGNDL^QDGGliLLNNFSALAMHECKC 
LWPDVPLECIVS LGTGRYBSDVRNTVTYTSLKTKLSNVINSATD 
TEEVHIMLDGLLPPDTYFRFNPVMCENIPLDESRNEKLDQLQLE 
GLK Y I E RNEQ KM K JCVAK I L S Q E KTTLQKI NDW I KLKTDM YEGL P 






WO 01/53312 



PCT/US00/34263



FFSKL 


6242 



198 


1310 


QHFLPGAETW S PGAAVCTARR F PGRSLAAFP RPAAPRRAVBMGE 
SSEDIDQMFSTLLGBMDLLTQSLGVDTLPPPDPNPPRAEFNYSV 
GFKDLNESLNAIiEDQDLDALMADLVADISEAEQRTIQAQKESLQ 
NQHHSASLQAS I FSGAASLGYGTNVAATGISQYEDDLPPPPADP 
VliDIiPLPPPPPEPLSQEEEEAQAKADKIKLALEKLKEAKVKKLV 
VKVHMNDNSTKSLMVDERQLARDVLDNLFBKTHCDCNVDWCLYB 
IYPELQIERFFEDHENWEVLSDWTRDTENKILFLEKEEKYAVF 
KNPQNFYLDNRGKKESKETNEKMNAKNKESLLEVRLILQSGRKB 
KDVCS I FKS FASENNGKI 


6243 


1509 


614


RSASRFSGCWSRDSTCCCePSTCWSRSSA^ePRARWPPSSAPAT 
TS RAS S RRLACG PQTRAGAETRSTAM I RANS AARDTRRATCRSA 
AGTPSPTTMTCLTDVPTGCAAVEPTARLPAAAWASTITTGCCPA 
MGQAGAGPAGRKGSEAGGGPGRAHHAHPSPLPREPRVRTGPPAH 
SPT PGS ID PS PELS WG SAG VTQE S PLLDP VDFLLFRTRAVDPLR 
R VFFF FYQHLTFFS IQPQP PPCHAFHPRDPPAGTKRQliI LVPLK 
GPPILAPILSLTPILSRMSCYFPRSRIAQGWHLS 


6244 


2119 


1745 


FEHAYASQFGTFIiGNNBSERCKLKIiQQKTMSLWSWVNQPSELSK 
FTNPL FEANNLVI WPS VAPQSLPLWEG I FLRWNRSS KYLDEAYE 
EMVNI IEYNKELQAKVNILRRQLAELETEDGMQESP 


6245 


81 


1148 


LSLRNAKYSFPQELISLFSMTDIiNDNICKRYIKMITNIVILSIil
I CI S LAFWI 1 3MTA3TY YGNLRP IS P WRWL FSWVP VLI VSNGL 
KKKS LDHSGALGGLWG FX LiTIANFS FFTSLLMFFLSSSKLTKW 
KGEVKKRLDSEYKEGGQRNWVQVFCNGAVPTELALLYKIENGPG * 
E I P VDFS KG/YSAS WMCLS LliAALACS AGDTWASEVGPVLSKSS P i 
RL I TT WE KVP VGTNGG VTWGLVS S LLGGTFVG I AYFLTQL I FV 
NDLDI SAPQWP I IAFGGLAGLLGS I VDSYLGATMQYTGLDESTG 
MWNS PTNKARH IAGKP I LDNNAVNL FS S VL IALLLPTAAWGFW 
PRG 


6246 


1177 


359 


SLWPV^ILMODSIiMQISLQLLCVYTANFPNGCSSLCWSSCGQHPV 
OATHRGAVSNSI^LCILKLASQMPLENTTVQQMVFMLLSNIiAIjS 
HDCKGVlQKSNFLQNFLSIjALPKGGNKHLSNLTILWLKLLLNIS 
S GEDGQQMI LRLDGCLDLLTEMSKYKHKSSPLLPLL I FHNVCFS 
PANKPKILANEKVITVLAACIiESENQNAQRlGAAALWALIYNYQ 
KAKTALKSPSVKRRVDEAYSLAKKTFPNSEANPLNAYYLKCLEN 
LVQLLNSS 


6247 


3 


1678 


NSRWGPWTEPSAGSLRPMARKQNRNSKELGLVPLTDOTSHAGP ' 
PGPGRALLECDHLRSGVPGGRRRKDWSCSLLVASLAGAFGSS FI* 
YGYNLSWNAPTPYIKAFYNESWERRHGRPIDPDTLTLLWSVTV 
S I FAIGGIjVGTL I VKMIG KVTjGRKHTLLANNGFAISAALLMACS 
LQAGAFEMLIVGRFIMGIDGGVALSVLPMYLSEISPKEIRGSLG 
QVTAIFICIGVFTGQI»LGLPELLGKESTWpyi,FGVIVVPAVVQL 
LSLPFLPDSPRYLLLEKHNEARAVKAFQTFLGKAHVSQEVBEVIi 
AESRVQRS IRLVSVLELLRAPYVRWQWTVIVTMACYQLCGLNA 
I WFYTNS I FGKAG I P PAKI PYVTLSTGG I ETLAAVFSGLV1EHL 
GRRPLLIGGFGIiMGLFFGTLTITLTLQDHAPWVPYLSIVGILAI 
IASFCSGPGGI PFI LTGEFFQQSQRPAAFI I AGTVNWL5NPAVG 
LLFPF I Q JCSLDTYCFL VFATI CITGAI YL YFVLPETKNRTYAE I 
SQAFSKRNKAYPPEEKI DSAVTDGKINGRP 


6248


56 


1773 


VP PPRMMAAVPPGLE PWNRVRI PKAGNRSAVTVQNPGAALDLCI
AAVI KECHLVILS LKSQTLDAETDVLCAVLYSNHNRMGRHKPHL 
ALKQ VEQCLKRLKNMNLEGS I QDLFELFS SNENQPLTTKVCWP 
SQPVVELVLMKVI^ACKLLLRI^CCCKTFLLTVKHliGLQEFI I 
LNIi VMVGLVS RLW VL YKGVLKRL I Uu YE PLFGLLQE VARIQPM P 
YFKDFTFPSDITEFLGCPYFEAFKKKMPIAFAAKGINKLLNKLF 
LINEQSPRASEETLLGISKKAKQMKINVQNNVDLGQPVKNKRVF 
KEESSBFDVRAFCNQLKHKATQETSFDFKCSQSRLKTTKYSSQK 
VIGTPHAKSFVQRFREAESFTQLSEEIQMAWWCRSKKLKAQAI 
FLGNKLL KSNRL KHLEAQGTSL P KKLEC I KTS I CNHLLRGSG I K 
TS KHHLRQRRS QNKFLRRQRKPQR KLQS TLLRE IQQ FSQGTRKS 
ATDTSAKWRLSHCTVHRTDLYPNSKQLLNSGVSMPVIQTKEKMI 
HENLRGIHENETDSWTVMQINKNSTSGTIKETDDIDDIFALMGV 


6249 


56 


1773 


VPPPRMMAAVPPGLEPWNRVRIPKAGNRSAVTVQNPGAALDLCI 
AAVIKECHLVILSLKSQTLDABTDVLCAVLYSNHNRMGRHKPHL 
ALKQVEQCLKRLKNMNLEGSIQDLFELFSSNENQPLTTKVCVVP 
SQ P WELVLMKVLGACKLLLRLLDCCCKTFLLTVKHLGIjQEFI I 
LNL VMVGLVSRLWVLYKG VLKRL I LL YE PL FGLLQE VAR I QPM P 
YFKDFTFPSDITEFLGKJPYFEAFKKKMPIAFAAKGINKLLNKliF 
LINEQSPRASEETLLGISKKAKQMKINVQNNVDLGQPVXNKRVF 
KBESSEFDVRAFCNQIiKHKATQETSFDFKCSQSRLKTTKYSSQK 
VIGTPHAKSFVQRFREAESFTQLSEEIQMAWWCRSKKLKAQAI 
FLGNKLLKSNRLKHLEAQGTSLPKKLECIKTSICNHLLRGSGIK 
TS KHHLRQRRS QNKFLRRQRKPQR KLQS TLLRE IQQ FSQGTRKS 
ATDTSAKWRLSHCTVHRTDLYPNS KQLLNSGVSMPVIQTKEKMI 
HENLRGIHENETDSWTVMQINKNSTSGTIKETDDIDDI FALMGV 


6250 


232 


1306 


liAALHIMALPFRKDLEKYKDLDEDELIiGNLSETELKQLETVIiDD 
LDPENALL PAG FRQKNQTSKSTTG PFDREHLLS YLEKEALEHKD 
REDYVPYTGEKKGKIFIPKQKPVQTFTEEKVSIiDPELEEALTSA 
SDTELCDLAAILGMHNLITNTK^CNI^SSNGVIX3EHFSNVVKG 
EKI LPVFDEP PNPTNVEESLKRTKENDAHLVEVNLNNI KNIPIP 
TLKD FAKALETNTHVKCFS LAATRSNDPVATAFAEMLKVNKTLK 
SLNVESNF I TGVG ILAL I DALRDNETLAE LKI DNQRQQLGTAVE 
LEMAKMIiEENTNILKFGYQFTQ.QGPRTRAANAITKNNDLVRKRR 
VEGDHQ 


6251 


62 


972 


TPGSGPMSAWAAASLSRAAARCLLARGPGVRAAPPRDPRPSHPE 
PRGCGAAPGRTLHFTAAVPAGHNXWS KVRHI KGPKDVERSRI FS 
KLCLN I RLAVKEGGPNP EHNSNLAN I LEVCRS KHMPKSTI ETAL 
KMEKS KDT YLLYEGRGPGGS SLL I EALSNS SHKCQADIRH I LNK 
NGGVMAVGARHSFDKKGVIVVEVEDREKKAVNLERALEMAIEAG 
AEDVKETEDEEERNVFKFICDASSLHQVRKKLDSLGLCSVSCAL 
EFI PNSKVQLAEPDLEQAAHLIQALS NHEDVIHVYDNIE 


6252 


27 


1897 


EEFCTWIAV^VGFJ^ETAPKPGKDVPPKKDKLQTKRKKPRRYWEE 
ETVPTTAGASPGPPRNKKNRELRPQRPKNAYILKKSRISKKPQV 
PKKPREWKNPESQRGLSGAQDPFPGPAPVPVEWQKFCRIDKSR 
KLPHS KAKTRS RLE VAEAE EEETS I KAARS ELLLAEE PGFLEGE 
DGEDTAKI CQAD I VEAVD I ASAAKH FDLNLRQFG P YRLNYSRTG 
RHIAFGGRRGH VAALDWVTKKLMCE I NVMEAVRD IRFLHSEALL 
AVAQ^RWLHIYDNQGIELHCIRRCDRVTRLEFLPFHFLLATASE 
TGFLT YLDVS VGKI VAALNARAGRLDVMSQNP YNAVIHLGHSNG 
TVSLWS PAMKE PLAKI LCHRGGVRAVAVDSTGTYMATSGLDHQL 
KIFDLRGTYQPLSTRTLPHGAGHLAFSORGLLVAGMGDVUKTTUa 
GOGKASPPSLEQPYLTHRLSGPVHGI.QFCPFEDVLGVGHTGGIT 
SMLVPGAGEPNFDGLESNPYRSRKQRQEWEVKALLEKVPAELI C 
LDPRALAEVDV1SLEQGKKEQIERLGYDPQAKAPFQPKPKQKGR 
SSTASLVKRKRKVMDEEHRDKVRQSLQQQHHKEAKAKPTGARPS 
ALDRFVR 


6253 


27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVPPKKDKIiQTKRKKPkRYWEE 
ETVPTTAGASPGPPRNKKNRELRPQRPKNAYILKKSRISKKPQV 
PKKPREWKMPESQRGLSGAQDPFPGPAPVPVEWQKFCRIDKSR 
KLPHS KAKTRS R LEVAEAEE EETS I KAARS ELLLAE E PGFLEGE 
DGEDTAKI CQADIVEAVDI AS AAKHFDLNLRQFGPYRLNYSRTG 
RHLAFGGRRGHVAALDWVTKKLMCE INVMBAVRD I R FLHSEALL
AVAQNRWLHI YDNQGIELHCI RRCDRVTRLEFLPFHFLLATASE 
TGFLTYLDVSVGKIVAALNARAGRIjDVMSQNPYNAVIHLGHSNG 
TVSaLWS PAMK EPLAKILCHRGGVRAVAVDSTGTYMATSGLDHQL 
KI FDLRGT YQ PLSTRTLPHGAGHLAFSQRGLLVAGMGDVVNI WA 
GQG KAS P P S L EQP YLTHRLS G P VHGLQ FC P F ED VLGVGHTGG I T 
SMLVPGAGEPNFDGLESNPYRSRKQRQBWEVKALLEKVPAELIC 
LDPRALAEVDVISLEQGKKEQIERLGYDPQAKAPFQPKPKQKGR 
SSTASLVKRKRKVMDEEHRDKVRQSLQQQHHKEAKAKPTGARPS 
ALDRFVR 


6254 


155 


1139 


HALGRRGGSQELSAAACGCFALRLRAPGSGRPALAPGAAAFAGL 
GGA PRFP PRGSAAGRTMLLKE YRI CMPLTVDE YKIGQLYMIS KH 
S HEQSDRGEGVE WQNE P FEDPHHGNGQFTEKRVYLNS KLPS WA 
RAWP KI F YVTEKAWN YYP YTI TE YTCSFLPKFS IH 1 BT KYEDN 
KGSNDTIFDNEAKDVEREVCFIDIACDEIPERYYKESEDPKHFK 
SEKTGRGQLREGWRDSHQPIMCSYIOiVTVKFEVWGIiQTRVEQFV 
HKVVRJ)ILLIGHRO^FAV^EWYDMTMDDVREYEKNMHEO/miK 
VCNQHS S P VDD I S3 RAQTST 


6255 


1 


1444 


PTRPQQBLLVSLATVIFVASQKALSVESKAVIKQQLESVSNGWT 
VYR IARQASRMGNHDMAKELYQSLLTQVASKHFYFWLNSliKEFS 
HAEQCLTGLQEENYSSALSCIABSLKFYHKGIASLTAASTPLNP 
LSFQCEFVKLRIDLLQAFSQLICTCNSLKTSPPPAIATTIAMTL 
GNDLQRCGRI SNQMKQSMEEFRSLAS RYGDLYQAS FDADSATLR 
NVELQQQS CliLI S HAI E AL ILDPESAS FQE YGSTGTAHADSE YE 
RRMMSVYNHVLEEVESLNGKYTPVSYMHTACLCNAIIALLKVPL 
S FQR YFFQKLQSTS I KLALS PS PRNPAEP IAVQNNQQLALKVEG 
WQHGS KPGL FRKI QS VCLNVSSTLQS KSGQDYKI PIDNMTNEM 
EQRVEPHNDYFSTQFLLNFAILGTHNITVESSVKDANGIVWKTG 
PRTTIFVKSLEDPYSQQIRLQQQQAQQPLQQQQQRNAYTRF 


6256 


1 


1542 


CRGAGAE PAANPRSPRS LVPSLES TS TS V PPAPGTMATDS W ALA 
VDEQEAAAESLSNLHLKEEKIKPD-mGAWKTNANAEKTDEEEK 
EDRAAQSLLNKLI RSNLVDNTNQVEVLQRDPNSPLYSVKSFEEL 
RLKPQLLQdVYAMGFNRPSKIQENALPLMLAEPPQNLIAQSQSG 
TGKTAAFVLAMLSQVEPANKYPQCLCI^PTYELALQTGKVIEQM 
GKFYPELKLAYAVRGNKLE RGQKI SEQ I VIGTPGTVLDWCS KLK 
FIDPKKIKVFVLDEADVMIATQGHQDQSIRIQRMLPRNCQMLLF 
SATFEDSVWKFAQKWPDPNVIKLKREEETLDTIKQYYVLCSSR 
DEKFQALCNLYGAI T I AQAM I FCHTR KTAS WLAAELS KEGHQVA 
LLSGEMMVEQRAAVI ERFREGKEKVLVTTNVCARG IDVEQVSVV 
INFDL P VDKDGNPDNETYLHR IGRTGR FGKRGLAVNMVDS KHSM 
NILNRIQEHFNKKIERLDTDDLDEIEKIAN 


6257 


210 


615


AFIPAMAELIQKKLQGEVEKYQQLQKDLSKSMSGRQKLEAQLTE 
NNIVKEELALLDGSNWFKLLGPVLVKQELGBARATVGKRLDYI 
TAEI KRYESQLRDLERQSEQQRETLAQLQQE FQRAQAAKAGAPG 
KA 


6258 


210 


615 


Kr X VJWlPdsu U X y KJUjU<j£ TV a K I U U-Ly i^iJ>iKSMSGRQlUjEAQLTE 

NNIVKEEIALJJDGSNWFKLLGPVLVKQELGEARATVGKRLDYI 
TAEIKRYESQLRDLERQSBQQRBTLAQLQQEFQRAQAAKAGAPG 
KA 


6259 


2 


1540


ILEKGFPSQCHPERKWKVDDVLESSQENEDDHFWELLFHNNKTV
S VENGDRGSKTFNLGTDPVSLRN YPYKI CDS CEMNLKNISGL 1 1 
SKKNCSRKKPDEFNVCEKLLLDIRHEKIPIGEKSYKYDQKRNAI 
N YHQDLSQPS FGQS FEYSKNGQGFHDEAAFFTNKRSQIGETVCK 
YNEOGRTFIESLKLNISQRPHLEMEPYGCSICGKSFCMNLRFGH 
QRALT KDNP YE YNE YGE I FCDNSAFI I HQGAYTRKI LRE YKVSD 
KTWEKSALLKHQIVHMGGKSYDYNENGSNFSKKSHLTQLRRAHT 
OEKTFECGECXSKTFWEKSNLTQHQRrHTGSKPYECTECGKAFCQ 
KPHLTNHQRTHTGE KP YECKQCGKTFCVKSNIjTEHQRTHTGE KP 
YE CN ACGKS FCHRS ALTVHQRTHTGEKPF I CNE OGKS F CVKSNL 
I VHQRTHTGEKPYKCNEOGKTFCE KSALTKHQRTHTGEKPYECN 
ACG KTFS QRSVLTKHQR IHTRVKALSTS 


6260 


2081 


1436 


GTG PE IHACAHAS ARAPGS RAMALREIiKVCLLGDTGVG KS S I VW 
RFVEDS FDPNI NPT I GAS FMTKTVQ YQNELHKFLI WDTAGQERF 
RALAPMYYRGSAAAI I VYDITKEETFSTLKNWVKELRQHGPPNI 
WAI AGNKCDL IDVRE VMERDAKD YADS IHAI FVETSAKNAINI 
NELFIEISRRIPSTDANLPSGGKGFKLRRQPSEPKRSCC 


6261 


3 


1188 


FWYRLGPGTRSRWPRRGSWAASLVPRGPSPAALVTSPCPPDPLR
SPACEPCRPDFAPRPAIiliRSGPRSAPAVTGKPALKGQPGPWPG
MAEVSIDQSKLPGVKEVCRDFAVLBDHTLAHSLQEQEIEHHLAS
NVQRNRLVQHDLQVAKQLQEEDLKAQAQLQKRYKDLEQQDCEIA
QEIQEKLAI EAERRRIQEKKDEDIARLLQEKELQEEKKRKKHFP
EFPATRAYADSYYYEDGGMKPRVMKEAVSTPSRMAHRDQEWYDA
E I ARKLQE E E LLATQVDMRAAQVAQDE E IARLLMAEEKKAYKKA 
KEREKSSLDKRKQDPEWKPKTAKAANSKSKESDEPHHSKNERPA 
RP P PPIMTDGEDADYTH FTNQQS STRHFSKSES SHKGFHYKH 


6262 


2 


1759 


PBCHSQGLCSVHRPGKVPQARMSGLVLGQRDEPAGHRLSQEEIL 
GS TRI» VSQGL EALRS BHQAVLQS I tSQTI ECLQQGG H EEGL VHEK 
ARQLRRSMENI ELGLSEAQVMLALASHLSTVESEKQKLRAQVRR 
LCQENQWLRDEIAGTQQRLQRSEQAVAQLEEEKKHLEFLGQLRC! 
YDEDGHTSBEKEGDATKDSIiDDLFPNEEEEDPSNGLSRGQGATA 
AQO^GYEIPARLRTLHNLVIQYAAQGRYBVAVPLCKQALEDLER 
TSGRGHPDVATMLNI LALVYRDQNfCYKEAAHLLNDALS I R ESTL 
GPDHPAVAATLNNLAVLYGKRGKYKEAEPLCQRAJuEIREKVLGT 
NHPD VAKQLNNLALL CQNQG KYEAVER YYQRALA I YEGQ LG PDN 
PNVARTKNNLASCYLKQGKYAEAETLYKE I LTRAHVQEFGS VDD 
DHKP I WMHAEEREEMSKSRHHEGGTPYAEYGGWYKACKVSS PTV 
NTTLRNI^iALYRRO/SKLEAAETLEECALRSRRQGTDPISQTKVA 
ELLGESDGRRTSQEGPGDSVKFEGGEDASVAVEWSGDGSGTLQR 
SGSLGKIRDVLRR 


6263 


1 


2408 


RELDSLADLPERIKPPYANGLSTSHIjRSSSVBDVKLlISEGRPT 
IEVRRCSMPSVICEHTKQFQTISEESNQGSLLTVPGDTSPSPKP 
EVFSNVPERDLSNVSNIHSSFATSPTGASNSKYySADRNLIKNT 
AP VN T VMDS PVH L E PS S Q VG VI QNKSWEM P VDRLETLST RD F I C 
PNSNI PDQESSLQS FCNSENKVLKENADFLS LRQTELPGNS CAQ 
DPAS FMP PQQP CS FPSQS LSDAES IS KHMSLS YVANQE PG I LQQ 
KNAVQIISSALDTDNESTKDTENTFVLGDVQKTDAFVPVYSDST 
IQEAS PNFE KAYTLP VLPSEKD FNGS DASTQLNTHYAFS KLTYX 
S SSGHE VENSTTDTQ VI SHEKENKLESLVLTHLSRCDSDLCBMN 
AGMPKGNLNEQDPKHCPESEKCLIiS IEDEESQQS ILSSLENHSQ 
QSTQPEMHKYGQLVKVELEENABDDKTENQI PQRMTRNKANTMA 
NQSKQILASCTLLSEKDSESSSPRGRIRLTEDDDPQIHHPRKRK 
vokv v y ±* vy vb fo LjIjUAat. K 1 S JUAAI VDSLKLDE IQP YSSER 
ANP Y FE YLH I RKKI E E KRKLLCS V I PQAPQYYDEYVTFNGSYLL 
DGNPLSKICIPTITPPPSLSDPLKELFRQ£EVVRMKLRLQHS IE 
REKL I VSNEQEVIiRVHYRAARTIiANQTLPFS ACTVLLDAEVYNV 
PLDSQS DDS KTS VRDRFNARQFMSWLQDVDDKFD KLKTCLLMRQ 
QHEAAALNAVQR LEWQLKLQ ELDPAT Y KS IS I YE I QEF YVPLVD 
VNDDFBLTPI


6264


143


1960


KHRQE20JAIJ3MAPEIHrn^PMCLIENTNGELVANPSALKIIiSAI 
TQPVVWAIVGLYRTGKSYLMNKLAGKNKGFSLGSTVKSHTKGI 
HMWCVPHP KKPEHTIjVIjLDTEGIjGDVKKGDNQNDS W I FTLAVLL 
SSTLVYNSMGTINO^AMDQLYYVTELTHRIRSKSSPDENENEDS 
ADFVSFFPDFVWTLRDFSLDLEADGQPLTPDEYLEYSLKLTQGT 
SQKDKNFNLPRLCIRKFFPKKKCFVFDLPiHRRKLAQLEl5^DE^ 
ELDPEFVQQVADFCS Y I FSNS KTKTLSGGI KVNGPRLESLVLTY 
INAISRGDLPCMENAVLALAQIENSAAVQKAIAHYDQQMGQKVQ 
L PAETLQELLDLHR VSE REATE VYMKNS FKDVDHLFQKKLAAQL 
DKKRDDFCKQNQBA3 SDRCSALLQVI FS PLEEEVKAG IYSKPGG 
YCLFIQKLQDLEKKYYEEPRKGIQAEEILQTYLKSKESVTDAIL 
QTDQ I LTEKEKE I EVECVKAESAQASAXMVEEMQ1 K YQQMMEE K 
EKSYQBHVKQLTEKMERERAQLLEEQEXTLTSKLQEQARVLKER 
CQGESTQLQNEIQKLQKTLKKKTKRYMSHKLKI 


6265 


143 


1960 


KHRQENNALDMAPE I HMTGPMCLIENTNGELVANPEALKILSAt 
TQPVVWAIVGLYRTGKSYLMNKLAGKNKGFSLGSTVKSHTKGI 
WM WCVPH P KKP EHTLVLLDTEGLGDVKKGDNQNDS WI FTLAVLIi 
SSTLVYNSMGTINQQAMDQIiYYVTELTHRIRSKSSPDENENEDS 
ADFVSFFPDFVWTLRDFSLDLEADGQPLTPDEYXEYSLKLTQGT 
SQKDKNFNLPRLCIRKFFPKKKCFVFDLPIHRRKLAQLEKLQDE 
ELDPEFVQQVADFCS YI FSNSKTKTLSGG I KVNGPRLESLVLTY 
INAISRGDLPCMENAVIiALAQIENSAAVQKAIAHYDQQMGQKVQ 
LPAETLQELLDLHRVSEREATE VYMKNS FKDVDHLFQKKLAAQL 
DKKRDDFCKQNQEAS SDRCSALLQVI FS PLEEE VKAGI YSKPGG 
YCLFIQKLQDLEKKYYEEPRKGIQAEEILQTYLKSKESVTDAIL 
QTDQILTEKEKEIEVECVKAESAQASAKMVEEMQIKYQQKMEEK 
EK5 YQEHVKQLTE KMERERAQLLEEQEKTLTS KLQEQARVLKE R 
CQGESTQLQNEIQKLQKTLKKKTKRYMSHKLKI 


6266 


276 


1421 


GSHQKQMLVPCFLYSLQNRKPSLYGSLTCQG 1 GLD&I PE VTAS E 
GFT VNE INKKS IHI SCPKENASSKFLAPYTTFSRIHTKS ITCLD 
I S S RGGLG VSS STDGTMKI WQASNGBLRRVL EGHVFDVNCCRFF 
PSG LWLSGGMDAQLKI WSAEDAS CWTFKGHKGG I LDTAI VDR 
G RNWSAS RD GTARL WD CGRS ACLGVLAD CG S S INGVAVGAADN 
SINLGSPEQMPSEREVGTEAKMLLLAREDKKLQCLGLQSRQLVF 
LFIGSDAFNCCTFLSGFLLLAGTQDGNIYQLDVRSPRAPVQVIH 
RSGAPVLSLLSVRDGFIASQGDGSCFIVQQDLDYVTELTGADCD 
P VYKVATWE KQ I YTCCRDGLVRR YQLSDL 


6267


3 


622 


LGMMKKNNS AKRGPQDGNQQPAP PE KVG WVRKFCGKG I FRE I WK
NRYWLKGDQLYISEKEVKDEKNIQEVFDLSDYEKCEELRKSKS 
RSKKNHS KFTLAHS KQPGNTAPNLI FLAVS PEE KE S W INALNS A 
ITRAKNRILDEVTVEEDSYLAHPTRDRAKIQHSRRPPTRGHLMA 
VASTSTSDGMLTLDLIQEEDPS PEEPTSLC 


6268 


160 


1368 


HRELCQNLPAGLSSAL I DNPLTLLLS I DT YVMLQBP VTFQDVAV
DFSREEWGLLGPTQRTEYRDVMLETFGHLVSVGWETTLENKELA
PNSDIPEEEPAPSLKVQESSRDCALSSTLEDTLQGGVQEVQDTV
LKQMESAQBKDLPQKKHFDNRESQANSGALDTNQVSLQKIDNPE
SQANSGALDTNQVIJjHKI PPRKRLRKRDSQVKSMKHNSRVKIHQ
KSCERQKAKEGNGCRKTFSRSTKQITFIRIHKGSQVCRCSECGK
IFRNPRYFSVHKKIHTGERPYVCQDCGKGFVQSSSLTQHQRVHS
GEKP* hCvECGRTFNDRSAISQHIJITHTGAKPYKCQD
SSHLIRHQRTHTGERPYACNKCGKAFTQSSHLIGHQRTHNRTKR
KKKQPTS


6269 


2886 


1449 


HASAPTRRNMAAASPLRDCHAWKDARLPLSTTSNEACKLFDATL
TQ YVKWTND KS LGG I E G CLS KLKAADP T F VMGHAMATGL VL I GT 
GSS VKLDKELDLAVKTMVE ISRTQPLTRREQLHVSAVETFANGN 
FPKACELWEQILQDHPTDMLALKFSHDAYFYLGYQEQMRDSVAR 
I YPFWTPDIPLSS YVKGI YSFGLMETNFYDQAEKLAKEALSINP 
TDAWS VHTVAH I HEMKAE I KDGLE FMQHS E TLW KDSDMLACHNY 
WEWALYLI EKGE YE AALT I YDXHILP SLQANDAMLDWDS CSML 
YRLQMEGVSVGQRWQDVLPVARKHSRDHILLFNDAHFLMASLGA 
HDPQTTQELLTTLRDAS ES PGENOQHLLARDVGLPLCQALVEAE 
DGNPDR VLELLLP I R YRI VQLGGSNAQRDVFNQLL I HAALNCTS 
SVHKNVARSLI^RDALKPNSPLTERLIRKAATVHLMQ 


6270 


23 


2086 


SVTVTIiGSEGDGRPPTYHIiEEMEQEPQNGEPAEiKIIREAYKKA 
FLFVNKGLNTDELGQKEEAKNYYKQGIGHLLRGISISSKESEHT 
GPG WES ARQMQQKMKETLQNVRTRLE I LE KGLATS LQNDLQE VP 
KLYPE FP P KDMCEKLPE PQ S FS S APQHAE VNGNTSTPSAGAVAA 
PASLSLPSQSCPAEAPPAYTPQAAEGHYTVSYGTDSGEFSSVGE 
EFYRNHSQPPPLETLGLDADELILIPNGVQIFFVNPAGEVSAPS 
YPGYLRIVRFLDNSLDTVLNRPPGFLQVCDWLYPLVPDRSPVLK 
CTAC^YMFPDTMLQAAGCFVGVVLSSELPEDDRELFEDLLRQMS 
DLRLQANWNRAEEENE FQI PGRTRPSSDQLKEASGTDVKQLDQG 
KKDVRHKGKRGKRAKDTSSEEVNLSHIVPCEPVPEEKPKELPEW 
S E KVAHN I LS GAS WVS WGLVKGAE I TGKAI QKGASKLRER IQP E 
EKPVEVS PAVTKGLY I AKQATGGAAKVSQFLVDGVCT VANCVGK 
E IAPHVKKHGS KLVPES LKKDKDGKSPLDGAMWAAS S VQG FST 
VWQGLECAAKCIVNKVSAETVQTVRYKYGYNAGEATHHAVDSAV 
NVGVTAYN I NN1G I KAM VKKTATQTGHTLLEDYQI VDNSQRENQ 
EGAANVNVRGE KDEQTKEVKEAKKKDK 


6271 


32 


10^8 


GCGVKTAGMVGREKELS IHFVPG S CRLVEEEVNI PNRRVLVTGA
TGLLGRAVHKE FQQNNWHAVG CG FRRARP KFE QVNLLD S NAVHH 
1 1 HDFQ PHV I VHCAAERRPD WENQPDAASQLNVDASGNLAKEA 
AAVGAFLI Y I S SDYVFDGTNPP YREEDIPAP LNLYGKTKLDGKK 
AVLENNLGAAVLRIPILYGEVEKLEESAVTVMFDKVQFSNKSAN 
MDHWQQRFPTHVKDVATVCRQLAEKRMLDPS I KGTFHWSGNEQM 
c#r'iML>vj./ujAb WJuPbbHJjRPITDSPVIjGAQRPRNAQLDCSKL 
ETLGIGQRTPFRIGIKESLWPFLIDKRWRQTVFH 


6272 


113* 


528 


GAVMEDAAAPGRTEGVLERQGAPPAAGQGGALVELTPTPGGLAL 
VS P YHTHRAGD PI/DLVALAEQVQKADBFIRANATNKLTVI AEQI 
QHLQEQARKVLEDAHRDANLHHVACNI VKKPGNI Y YLYKRESGQ 
QYFS 1 ISPKEWGTSCPHDFLGAYKLQHDLSWTPYEDIEKQDAKI 
SMMDTLLSQSVALPPCTEPNFQGLTH 


6273 


256 


843 


SCPRVSPECRSLGCQVMFSLPLNC^PDHIRRGSCWGRPQDliKIA 
SAAWNS KCHPGAGAAMARQHARTLW YDRPRYVFMEFCVEDSTDV 
n V lil aunKX v r b AUU VbL YNE I E F YAKVNS KDSQD ICRS S RS 
ITCFVRRWKEKVAWPRLTKEDIKPVWLSVDFDNWRDWEGDEEME 
LAHVEHYAEVRDNTYCVLPT 


6274 


56 


1142 


AAAAMAAAAGGGAGAARSLS R FRGCl*AGAIiLGI>CV6s F YEAHIXf 
VDLTS VLRHVQ S L EPDPGTPGS BRTEALYYTDDTAMARALVQSL 
LAKEAFDEVDMAHOFAflRYlf imDDOrtvnftntTTTTt/c'WT t unvm 

uniuMir kJEt V Lwl/Wittr MyO I XsJsJJ tr UKu X Unu VV J, V r XUxuLiN P KvR 

DVFEPARAQFNGKGSYGNGGAMRVAG I SLAYS SVQDVQKFARLS 
AQLTHASSIXjYNGAII^AIAVHIJu^ESSSKHFLKQIiLGHMED 
LEGDAQS VLDARE LGMEERP YS SRLKKI GEIiLDQAS VTRBE WS 
ELGNGIAAFESVPTAIYCFLRCMEPDPEIPSAFNSIiQRTLIYSI 
SLGGDTDT 1 ATMAGAIAGAYYGMDQVPES WQQS CEGYEETD I LA 
QSLHRVFQKS 


6275 


20 


565 


SRRGRARCLARGSRRPVPRPAKT^FMVJCTMVGGQLKNLTGSLG 
GGEDKGDGDKS AAEAQGMSREE YE E YQKQLVEE KMERDAQFTQR 
KAERATLRS HFRDK YR LPKNETD ESQ IQMAGGDVE LPRELAKM I 
EEDTBEEEEKASVLGQLASLPGLNLGSIiKDKAQATLGDIiKQSAE 
KCHVM 


6276 


797 


37 


TLLPLP P LPDTEGM I LIiNTGLEGT VAENP VPI VHTPSGltf II/TLE
S CLQQLATHPGHWG I HLQIAE PAALRP SLALLARLS SLGLLHWP 
VWVGAKI SHGSFS VPGHVAGREIiLTAVAEVFPHVTVAPGWPEEV 
LGSG YREQLLTDMLELCQGLWQ PVS FQMQAMLLGHSTAGAI GRL 
LASS PRATVTVEHNPAGGDYASVRTALLAARAVDRTRVY YRLPQ 
GYHKDLLAHVGRN 


6277 


4600 


2744 


MAFRTEMGLY YS YFKT I VE APSFLNGVWMIMNDKLTEYPLVINT 
LKRFNLYPB VI LAS W V R I YTKIMDLIG IQTKI CWTVTIGEGLS P 
TESCEGLGDPACFYVAVI FI LNGLMMAIaPFI YGTYLSGSRLGGL 
VTVLCFFFNHGECTRVMWTPPLRESFSYPFL^MLLVTHILRA 
TKLYRGSLIALCISNVFFMLPWQFAQFVLLTQIASLFAVYWGY 
ID I CKLRKI I YIHMISLALCFVLMFGNSMLLTSYYASSLVI IWG 
ILAMKPHFLKI NVS ELS LW VI QGCFWLFGTV I LKYLTSKI FGI A 
NDAHI QNLLTS KFF3YKDFDTLLYTCAAEFDFMEK3TPLRYTKT 

LLLPWLVGFVAIVRKI i sdmwgvlakqqthvrkhqfdhge l vy 

HALOLliAYTALG ILIMRLKL'LTPHMCVMASLT CSROLFCWLPr 
KVHPGAI VFAILAAMS IQGSANLQTQWNIVGEFSNXiPQEELIEW 
1 KYSTKPDAVFAGAMPTMASVKLSALRP I VNHPHYEDAGLRART 
KIVYS MYSRKAAEEVKR EL I KLKVNYYI LEES WCVRRSKPG CS M 

PE 1 WDVEDPANAGKTPLCNLLVKDSKPHFTTVFQNS VYKVLEVV 
KE 


6278 


3 


823 


1 LFRLVLLSLV YLLNS VATEERKPAB VL 1 VEGQQ YAWGT VLLE 
I R I ILE YCQG VDNI P S VTTDMLTRLSDLLKYFNSRS CQLVLG AG 
ALOWGLKTITTKNLALSSRCLQLIVHYIPVIRAHFEARLPPKQ 
YSMLRHFDHITKDYHDHI AEI SAKLVAX MDSLFDKLLS KYEVKA 
P VPSACFRNICKQMTKMHBAI FDLLPEEQTQMLFLRINAS YKLH 
LKKQLSHLNVI NDGG PQNGLVTADVAFYTGNLQALKGLKDItDliN 


6279 


127 


1687 


GGAMASDGARKQFWKRSNSKLPGS IQHVYGAQHPPFDPLLHGTL 
LRSTAKMPTTPVKAKRVSTFQEFESNTSDAWDAGEDDDELLAMA 
AESLNSEWMETANRVLRNHSQRQGRPTLQEGPGLQQKPRPEAE 
PPS PP SGDLRLVKS VS ESHTS CPABSASDAAPLQRSQS LPHSAT 
VTLGGTSD PSTLS S SALSEREASRLDKFKQLLAG PNTDLEELRR 
LS WSG I PKPVRPMTMKLLSGYLPANVDRRPATLQRKQK3 YFAFI 
EHYYDSRNDEVHQDT YRQIH I D I PRMS PEAL I LQPKVTEI FER I 
LFIWAIRHPASGYVQQINDLVTPFFWFICEYIEAEEVDTVDVS 
GVPAEVLCNIEADTYWCMSKLLDGIQDNYTFAQPGIQMKVKMLE 
E LVSR IDEQ VHRHLDQHE VRY LQFAFRWMNNLLMRE VPLRCT I R 
LWDTYQSEPDG FSHFHLYVCAAFLVRWR KE I LEEKD FQE LLLFL 
QNLPTAHWDDEDI SLLLAE AYRLKFAFADAPNH YKX 


6280 


857 


2515 


ECCDQKMGSRNSSSAGSGSGDPSEGLPRRGAGLRRSEEEEEEDB
DVDLAQVLAYLLRRGQVRLVQGGGAANLQFIQALLDSEEEKDRA
MDGRIXIDRYNPPVDATPDTRELEFNEIKTQVELATGQLGLRRAA
QKHSFPRMLHQRERGLCHRGS FSLGEQSRVI SHFLPNDLGFTDS

YSQKAFCG I YSKDGQ I FMS ACQDQTI RLYDCR YGRFRKFKS I KA 
RDVGWSVLDVAFTPDGNHFLYSSWSDYIHI CN I YGEGDTHTALD 
LRPDERRFAVFSIAVSSDGREVLGGANDGCLYVFDREQNRRTLQ 
IESHBDDVNAVAFADISSQILFSGGDDAICKVWDRRTMREDDPK 
PVGALAGHQDGITFIDSKGDARYLISNSKDQTIKLWDIRRFSSR 
BGMEASRQAAT^NWDYRWQQVPKKAWRKLKLPGDSS LMT YRGH 
GVLHTLIRCRFSPIHSTGQQFIYSGCSTGKVVVYDLLSGHIVKK 
LTNHKACVRDVS WHPFEE KI VSSSMDGNLRLWQYRQAEYFQDDM 
PBSEECASAPAPVPQSSTPFSSPQ 


6281 


857 


2515 


ECCDQKMGSRNSSSAGSGSGDPSEGLPRRGA^LRRSEtiBEEEDE 
DVDLAQVLAYLLRRGQVRLVQGGGA71NLQFIQALLDSEEENDRA 
WDGRLGDRYNP P VDATPDTRELEFNE I KTQVELATG QLGLRRAA 
QKHS FPRMLHQRERGLCHRGS PSLGEQSRVISHFLPNDLGFTDS 
Y S Q KA FCG I YS KD QQ I FM S ACQDQT I R LYDCR YGRFRKFKS I KA 
RDVGWS VLDVAFTPDGNHFLYSSWSD Y IHICN I YGEGDTHTALD 
LRPDERRFAVFS I AVSS DG RE VLGGAN DGCLYVFD R EQNRRTL Q 
IESHEDDVNAVAFADISSQ I LFSGGDDAICKVWDRRTMREDDPK 
PVGALAGHQDGITFIDSKGDARYLISNSKDQTIKLWDIRRFSSR 
EGMEASRQAATQQNWDYRWQQVPKKAWRKLKLPGDSSLMTYRGH
GVXjHTL IRCRFSPIHSTGQQFIYSGCSTGKVWYDLLSGHI VKK
LTNHKAC VRDVS WHP FEEKI VS SS WDGNIiRLWQ YRQAE YFQDDM
PESEECASAPAPVPQSSTPF5S PQ


6282



125 



906 



RMAACRALKAVLVDLSGTLHIEDAAVPGAQEALKRLRGASVilR 
FVTNTTKESKQDLLERLRKLEFDISEDElFTStiTAARSLLERKQ 
VRPMIiLVDDRALPDFKGIQTSDPNAWMGLAPEHFHYQ I LNQAF 
RLLLDGAPL I AIHKARY YKRKDGLALG PGPF VTALE YATDTKAT 
WGKPEKTFFLEALRGTGCEPEEAVMIGDDCRDDVGGAQDVGML 
GILVKTGKYRASDgEKINPPPYLTCESFPHAVDHILQHLL 



6283 



140 



LSI, FGIHVMNPFWSMSTSSVRKRSEGEEKTLTGDVKTS P PRTAP 
KKQLPS I PKNALP ITKPTS PAPAAQSTNGTHAS YGPFYLEYSLIj 
AE FTLWKQ KLPG VYVQPS YRSALMWFGVI PIRHGLYQDGVFKF 
TVYIPDNYPDGDCPRLVFDIPVFHPLVDPTSGELDVKRAFAKWR 
RNHNHI WQ VIiMYARRVFYKIDTAS PLNPEAAVL YEKD IQLFKS K 
VVDS VKVCTARLFDQPKIEDPYAIS FSPWNPSVHDEAREKMLTQ 
KKKPEEQHNKS VHVAGLS WVKPGS VQP FSKEEKTVAT 



6285



2157 



RS VI PGSTI S S RWPGLSRPRFMAAHE WDWFQRE EIj IGQI S DI RV 
QNLQVERENVQKRTFTRWINLHLEKCNPPLEVKDLFVDIQDGKI 
LMALLE^LSGRNLLHEYKSSSHRIFRLNNIAKALKFLEDSNVKL 
VSIDAAEIADGNPSLVLGLI WNIILFFQIKELTGNLSRNS PSSS 
LAPGSGGTDSDSSFPPTPTAERSVAI S VKDQRKAI KALLAWVQR 
KTRKYGVAVQDFAGS WRSGLAFLAV I KA1DPS LVDMKQALENST 
RENLEKAFSIAQDALHI PRLLEPEDIMVDTPDEQS IMT YVAQFL 
ERFPELEAEDIFDSDKEVP I ESTFVR I KETPSEQESKVFVLTEN 
GERTYTVNHETSHPPPS KVFVCDKPESMKEFRLDGVSSHAL3DS 
STEFMHQIIDQVIjQGGPGKTSDISEPSPESSILSSRKENGRSNS 
LP I KKT VHFEADTYKDP FCS KNDS LCFEGSPRVAKES LRQDGHV 
LAVEVAEEKEQKQESSKIPESSSDECVAGDIFLVEGTNNNSQSSS 
CNGALESTARHDEESHSLSPPGENTVMADSFQIKVNLMTVEALE 
EGDYFEAIPLKASKFNSDLIDFASTSQAFNKVPSPHETKPDEDA 
EAFENHAEKLGKRSIKSAHKKKDSPEPQVKMDKHEPHQDSGEEA 
EGCPSAPEETPVDKKPEVHEKAKRKSTRPHYEEEGEDDDLQGVG 
EELSSSPPSSCVSLETLGSHSEEGLDFKPSPPLSKVSVIPHDLF 
YFPHYEVPLAAVLEAYVEDPEDLKNEEMDLEEPEGYMPDLDSRE 
EEADGSQSSSSSSVPGESLPSASDQVLYLSRGGVGTTPASEPAP 
LAPHBDHQQRETKENDPMDSHQSQES PNLENI ANPLEENVTKES 
ISSKKKEKRKHVDHVESSI»FVAPGS VQSSDDLEEDS SDYSI PSR 
TSHSDSSIYLRRHTHRSSESDHFSLCSVEERSfiSG 



6286 



1619 



SCKTENLLEMWWFQQGLSFLPSAliVIWTSAAFIFSYlTAVTLHIl
IDPALPY I SDTGTVAPE KCLFGAMLNI AAVLCIAT I YVRYKQVH 
ALSPEENVI I KLNKAGL VLG I L S CLGLS I VANFQKTTLFAAHVS 
GAVLTFGMGSLYMFVQT I LS YQMQPKI HGKQVFW IRLLLVI WCG 
VS ALSMLTCS SVLHSGNFGTDLEQKLHWNPEDKG YVLHMITTAA 
E WSMSFS FFGFFLT Y I RDFQKI SLRVEANLHGLTLYDTAPCP IN 
NERTRLLSRDI 



KAGASCCGSANPYVSVGKSCVLLAMAQLQTRFYTDNKKYAVDDV
PFS I PAASE IADLSNI INKLLKDKNE FHKHVEFDFL I KGQFLRM 
PliDKHMEMENISSBEWE IE YVEKYTAPQPEQCMFHDDWISS I K 
GAEEWILTGSYDKTSRIWSLEGKSIMTIVGHTDVVKDVAWVKKD 
SLSCLLLSASMDQTILLWEWNVERNKVKALHCCRGHAGSVDS IA 
VDGSGTKFCSGSWDKMLKIWSTVPTDEBDEMEESTNRPRKKQKT 
EQLGLTRTP I VTLSGHM EAVSS VL WS OAEE I CS AS WDKT r R VWD 
VESGSLKSTLTGNKVFNCISYSPLCKRLASGSTDRHIRLWDPRT 
KDGSLVSLSLTSHTGWVTSVKWSPTHEQQLISGSLDNlVKIiWDT 
RSCKAPLYDLAAHEPKVL3VDWTDTGLLLSGGADNKLYSYRYSP 
TTSHVGA 


6287 


27-8 



1482 


MQFFFNFQIGLRSTSGKEKYSGDAG^L6DALQLFLQCIALDEDF
APAKLQVQKILCDLLIIPENLKEGLKESSWSSLPCTKNRPFDFHS
VMEESQSLNEPS PKQSEE I PE VTSEPVKGSLNRAQSAQS INSTE
MPAREDCLKRVS S E P VLS VQBKGVLLKR KLSLLEQD VI VNEDGR

xvAJutuvUv'A * rNh. VCMrSLAYGDIPEELIDVSDFECSIjCMRLFFE 
PVTTPCGHSFCKNCLERCLDHAPYCPLCKESLKEYLADRRYCVT 
QLLEELIVKYLPDELSERKKIYDEETAELSHLTKNVPIFVCTMA 
YPTVPCPLHVFEPRYRLMIRRSIQTGTKQFGMCVSDTQNSFADY 
G CMLQ IRNVH FLPDGRS WDT VGGKRFRVLKRGMKDG YCTAD I E 
YLEDV 


6288 






VTliYPCRGLVGNLIiLGASGMASGCKIGPS I LNSDltANLGAECLR 
MLDSGADYLHLDVMDGHFVPNITFGHPWESLRKQLGQDPFFDM 
HMM VS KPEQWVKPMAVAGANQ YTFHLEATENPG ALIKD I RENGM
KVG LAI KPGTS VE YUVP WANQI DMAL VMT VE PGFGGQKFMEDMM 
PKVHWLRTQFPSLDIEVDGGVGPDTVHKCAEAGANMIVSGSAIM 
RS EDPRS VINLLRNVCSEAAQKRS LDR 


6289


1 


743 


VTLYPCRGLVGNLIiLGASGMASGCKIGPS I LNSDLANLGAECLR 
MLDSGADYLHLDVMDGHFVPNITFGHPWESLRKOLGQDPFFDM 
HMMVSKPEQWVKPMAVAGANQYTFHLEATENPGALIKDIRENGM 
KVGLAI KPGTS VEYLAPWANQI DMALVMTVEPG FGGQKFMEDMM 
PKVHI^RTQFPSLDIEVIX^VGPDTVHKCAEAGANMIVSGSAIM 
RSEDPRSVINLLRNVCSEAAQKRSLDR 


6290 


3 


185S 


TJjORWLLG VYETVAPTLACLPRPRLRRRRRRRRRRMI SRYTRKA 
VPQSLELKG I T KKALNHHP PP EKLEEIS PTSDSHEKDTSSQS KS 
DI TRESS FTSADTGN SLSAFPS YTGAG I STEGSSDFS WG YGELD 
QNATE KVQTMFTAI DELLYEQKLS VHTKS LQEECQQWTAS FPHL 
R1LGRQIITPSEGYRLYPRSPSAVSASYETTLSQERDSTIFGIR 
GKKLHFSSSYAHKASSIAKSSSFCSMERDEEDSIIVSEGIIEEY 
LAFDHIDIEEGFHGKKSEAATEKQKLGYPPIAPFYCMKEDVLAY 
VFDSVWCKWSCMEQLTRSHWEGFASDDESNVAVTRPDS ESS CV 
LSELHPLVLPRVPQSKVLYITSNPMSLCCSASRHQPNVNDLLVHG 
MPLQPRNLSLMDKX.LDLDDKLLMRPGSSTILSTRNWPNRAVEFS 
TSSLSYTVQSTRRRNPPPRTLHPISTSHSCAETPRSVEEILRGA 
RVPVAPDSLSSPSPTPLSRKNLLPPIGXAEVEHVSTVGPQRQMK 
rnwuooKAUiAVVlJJiPNiQQPQERIjIjIiPDFFPRPNTTQSFLLDT 
QYRRSCAVEYPHQARPGRGSAGPQLHGSTKSQSGGRPVSRTRQG 
P 


6291 


1732 


602 


LVAKMASSASARTPAGKRVII^QEELRRLMKEKQRLSTSRKRIES

PFAKYNRLGQLS CALCNTPVKSELLWQTHVLGKQHREKVAELKG 
AKEASQGSSASSAPQSVKRKAPDADDQDVKRAKATIiVPQVQPST 
o/iyvi AiMr uMURCf IKAxVoitt'bCaljbliljPDxEDEEEEEEEEEGD 
GERKRGDASKPLSDAQGKEHSVSSSREVTSSVLPNDFFSTNPPK 
APIIPHSGSIEKAEIHEKWERRENTAEALPEGFFDDPEVDARV 
RKVDAPKDQMDKEWDEFQKAMRQVNTISEAIVAEEDEEGRLDRQ 
IGEIDEQIBCYRRVEKLRNRQDEIKNKLKEILTIKELQKKEEEN 
ADSDDEGELQDLLSQ DWRVKGALL 


6292 


1835 


1142 


TCPGAMKMVAPWTRF YSNSC CLCCHVRTGT I IiLGVW YLI INAW 
LLIIiLSALADPDQYNFSSSELGGDFEFMDDANMCIAIAISLLMI 
LICAMATYGAYKQRAAWI IPFFCYQIFDFALNMLVAITVLIYPN 
S IQEYIRQIiPPNFP YRDDVMS VNPTCLVliI I LLFIS I ILTFKGY 
LIS CVWNCYRYI NGRNSSDVLVYVTSNDTTVLLPP YDDATVNGA 
AKEPPPPYVSA 


6293 


2382 


1035 


FWCTLGTVDVHPIGWCAINSKILVPPRTIHAKFTDWKGYLMKRL 
VGSRTLPVDFHIKMVESMKYPFRQGMRLEVVDKSQVSRTRMAVV 
DTVIGGRLRLLYEDGDSDDDFWCHMWSPLIHPVGWSRRVGHGIK 
MSERRSDMAHHPTFRKIYCDAVPYLFKKVRAVYTEGGWFEEGMK 
LEAIDPLNLGNI CVATVCKVLLDGYLMI CVDGGPSTDGLDWFCY 
HASSHAIFPATFCQKNDIELTPPKGYEAQTFNWENYLEKTKSKA 
APS RLFNM DCPNHG FKVGMKLEAVDLMEPRL1 CVATVKRWHRL 
LSIHFDGWDSEYDQWVDCESPDIYPVGWCELTGYQLQPPVAAEP 
ATPLKAKEATKKKKKQFGKKRKRIPPTKTRPLRQGSKKPLLEDD 
PQGAR KISSE P VPGE 1 1 AVRVKEEHLDVAS PDKASS PELPVS VE 
NIKQETDD 


6294


354 


1814 


AQLTTRGRTVAGG VRW I PSPFPDLELYS CCLGTDkGFPELSHHC 
KNVIATAS DYDMAE I TNIRPS FDVS PWAGLI GASVLWCVSVT 
VF VWSCCHQQAEKKHKNPP YKFIHMLKG IS I YPETLSNKKKI 1 K 
VRRDKDGPGREGGRRNLLVDAAEAGLIiSRDXDPRGPSSGSCIDQ 
LPIKMDYGEELRSPITSLTPGESKTTSPSSPEEDVMLGSLTFSV 
DYNFPKKALVVTIQEAHGLPVMDDQTQGSDPYIKMTILPDKRHR 
VKTRVLRKTLDPVFDETFTFYGIPYSQLQDLVLHFLVLSFDRFS 
RDDVTGE VMVPLAGVD PSTGKVQLTRDI IKRNIQKCISRGELQV 
S L S YQPVAQ R MT VWLKARH LQKMD I AGLS GKTP YVKVNVY YGR K 
RIAKKKTHVKKCTLNPIFNESFIYDIPTDLLPDISIEFLVIDFD 
RTTKNEWG RLI I/3AHS VTASGAEH WREVCE3 PRKPVAKWHSLS 
EY 


6295 


279S 


617 


VSSAIiLTGATSGSDAAKSEGASAS PLSCTNAVAM0RPDEGPPAK 
TRR LS SSES PQRDPP PP P P PPFLLRLPLPPPQQRPRLQEETEAA 
QVIiADMRGVGliGPALPP PPP Y V I L EEGGI RAYFTLGAE CPGWDS 
TIESGYGEAPPPTESLEALPTPBASGGSLEIDFQWQSSSFGGE 
GALE TCS AVGW APQRLVD P KS KEEAI 1 1 VEDEDEDSRESMRSSR 
RRRRRRRRKQRKVKRBS RERNABRMES ILQALEDIQLDLEAVNI 
KAGKAFLRLKRKFIQMRRPFLERRDLIIQHIPGFWVKAFLNHPR 
ISILINRRDEDIFRYLTNLQVQDLRHISMGYKMKLYFQTNPYFT 
NMVIVKEFQRNRSGRLVSHSTPIRWHRGQEPQARRHGNQDASHS 
FFSWFSNHSLPEADRIAEI IKNDLWVNPLRYYLRERGSRIKRKK 
QEMKKRKTRGRCE WIMEDAPDYYAVEDI FSEI SDIDETIHDI K 
ISDFMETTDYFETTDNBITDINENICDSENPDHNEVPNNETTDN 
NESADDHETTDNNESADDNNENPEDNNKNTDDNEENPNNNENTY 
GNNFFKGG FWGSHGNNQDS S DSDNEADE ASDDEDNDGNEQDNEG 
SDDDGNEGDNEGSDDDDRDIEYYEKVI EDFDKDQADYEDVIEI I 

SDESVEEEGIEEGIQQDEDIYEEGNYEEEGSEDVWEEGEDSDDS 
DTiRfiVTiOVDMf'SW AMDrtTrDr* v*rva 


6296


727 


1199 


RHCGCI1AQGACDSLPPTGTSSPWARNAI PEARCCVWLLDGTTV J 
EAVRPARERLARKELRQKRMQQFSRDSAYSSNKDSTCLLTERDT 
LGTSIiQFPSPFSGTISFGSFSDSJSIFPLGSQCCLGFQQFSISGK 
KWAL IHKRVRLSVFGARWGRI YFGK * 1 


6297 


1 


922 


QRAAAAS PSSCX3PRGAE YGALMAMEGYWRFLALLGS AblJVGFLS 
VI FALVWVLHYREGLGWIXSSALEFNWHPVI^VTGFVFIQGIAI I 
VYRLPWTWKCS KLLMKS I HAGLNAVAAI LAI I S WAVFENHNVN 
N I ANMYSLHS WVGL IAVI C YLLQLLSGFS VFLL P WAPLS LRAFL 
MP I HVYSG I VI FGTVIATALMGLTEKL I FSLRDPAYS TFPPEG V 
FVNTLGLLILVPGALI FWI VTRPQWKRPXBPNSTILHPNGGTEQ 
GARGSMPAYSGNNMDKSDSELNNEVAARKRHLALDEAGQRSTM 


6298 


3 


985 


SVPLRRLSLSGTLQGAGTTTKMAVARLAAVAAWVPCRSWGWAAV 
PFGPHRGLS VLLAR I PQRAPR WLPACRQ KTSLS FLNR PDLPNLA 
YKKLKG KS PGI I FI PGYLSYMNGTKALA I EEFCKSLGHACI RFD 
YS GVGS SDGNSE ESTLGKWRKDVLS 1 1 DDLADGPQ I LVGS S LGG 
VJLMLHAAIARPEKVVALIGVATAADTLVTKFNQLPVELKKEVEM 
KGVWSMPSKYSEEGVYNVQYSFIKEAEHHCLLHSPIPVNCPIRL 
LHGMKDDI VPWHTS MQVAJDRVLSTDVDV1 LRKHSDHRNRE KAD I 
QLLVYTIDDLIDKLSTIVN


6299 


512 


614 


BCDIiEGIMPNVTISLSLPTNGSPLQDILVHPCVTSLDSAILTSS 
SIDAMDDSAFSGPYKFPFTPPLESPNLCFYTSQVPVPPILGFYQ 
MKEEEVQLRNNH 


6300 


121 


692 


AAPSCWSQRGVPAAGTPSSPRLLVSRAAAPSAGPWGAWRQGARA 
AQS P FS I PNSSS VPYGS QDS VHSS PEDGGGGRDR P VGGS PGGPR 
LVIGSLPAHLSPHMFGGPKC3PVCSKFVSSDEMDLHLVMCLTKPR 
ITYNEDVLSKDAGECAICLEEIiQQGDTIARLPCLCIYHKGCXDE 
WFEVNRSCPEHPSD 


6301 


616 


284 


GKFVPVNWEPPQPLPFPKYLRCYRCLLETKELGCtLdibteLTP 
AGSSCITLHKKNSSGSDVMVSDCRSKEQMSDCSNTRTSPVSGFW 
IFSQYCFLDFCNDPQNRGLYTP 


6302 


490 


745 


IFGFLHLFHMEHSFLLVCALFAHVPFSSSCGSSVALHSDPCLLS 
PVLLNCIiPGDLRPLDELYAQKLKYKAI SEELDHALNDMTSL 


6303 


2 


1961 


YKNEYGGGLLWQSWQEKHPGQALSSBPWNFPDTKEEWEQHYSQiT" 
YWYYLEQFQYWEAQGWTFDASQSCDTDTYTSKTEADDKNDEKCM 

TfVmT«'\/CI?T,CCDTMf2rYWT%OCr*TCr\VftueniTT TIPTOmt irr Mpnennn 

a vjjjj v a c Lioo tf xn^>ueiut>t>\3 1 bU&UH i>c. 1 JjJJCsloNLKLiNoEEVT 
QSQLDSCTSHDGHQQLSEVSSKRECPASGQSEPRNGGTNEESNS 
SGNTNTDPPAEDSQKSSGANTSKDRPHASGTDGDESEEDPPEHK 
PS KLKRSHELD I DENPASDFDDSGSLLGFKYGSGQKYGG I PNFS 
HRQVRYIiEKNVKLKSK YLDMRRQIKMKNKHIFFTKESEKP FFKK 
SKILSKVEKFLTWVNKPMDEEASQES SSHDNGHDAS TSCDSEEQ 
DMSVKKGDDLLETNNPEPEKCQSVSSAGELETENYERDSLLATV 
PDEODCVTQEVPDSRQABTEAEVKKKKNKKKNKKVNGLPPEIAA 
VPELAKYWAQRYRLFSRFDDGIKLDREGWFSVTPEKIAEHIAGR 
VSQS FXCDVWDAFCGVGGNT I QFALTGMR VI AID I DPVK I ALA 
RNNAEVYG I ADK I E FI CGDFLLLAS FLKAD WFLS P P WGGPD YA 

LAGPGGQVEIEQOTIiNNKLKTITAYFGDLIRRPASET 


6304 


1 


1438 


HRARVDRSRESPGGDLRHPGRVRRDITLSGHPRLSTQHWLIjRE" 

DEVGDPGTKDIX3HPQHGSPIQETQSEVVTLVSPLPGSDMAALPA 

WRATSGLTL^HTAEGRDLLGAENRALTGGQQAEJDPTLASGAYQ 

WPGSVEKLQGSW/CDAETLLSSSRTGGQAPPWLTDHDVQMLRLL 

AQGEWDKARVPAHGQVLQVGFSTEAALQDLSSPRLSQLCSQGlj 

CGLIKR PGDLPEVLS FHVDR VLGLRRS L PAVARRFHS PIiLPYRY 

TDGGAR PVI WWAPDVQHLSDPDBDO^SI*AliGMLQYQALLAHS CN 

WPGQAP CPGIHHTE WARWutiFDFIjLQVHDRIiDRYCCG FEPEPSD 

PCVEE RLREKCRNP AE LRLVH I LVRS S D PSHLVYI DNAGNLQH P 

EDKLNFRLLEGIIX5F?BSAVKV1^GCLQNMLLKSU)MDPVFWE 

SQGGAQGLKQVLQTLEQRGQVLLGHIQKHNLTLFRDEDP 


6305 


99 


420 


NMIWRGRSTYRPRPRRSVPPPEI.IGPMLEPGDEEPQQEEPPTES 
RDPAPGQEREEXXJGAAETQVPDLEADLQELSQSKTGDECGDGPD 
VQGKILTKSEQFKMPEGR 


6306 


X 


1874 


PTRPS KVICVP HTFL IHS YTRPT VCQACKKLLKGi7raG^3LQCKDC 
KFNCH KRCATRV PNDCLGEAL I NGD VPME EATDFSE ADKS ALMD 
ESSDSGVI PGSHSENALHASEEEEGEGGKAQSSLGYIPLMRVVQ 
SVRHTTRKSSTTLREGWVVHYSNKDTLRKRITYWRLDCKCITLFQ 
NNTTNR Y YKE I PLSE I LT VES AQKFSLVPPGTNPHC FE I VTANA 
TYFVGEMPGGTPGGPSGQGAEAARGWETAIRQALMPVILQDAPS 
APGHAPHRQAS LSI S VSNSQ I QENVDIATVYQI FPDB VLGSGQF 
GVVYGGKHRKTGRDVAVKVIDKLRFPTKQBSQLRNEVAIIiQSLR 
HPGIVNLECWFETPEKVFVVMErajHGDMLEMILSSEKGRLPERL 
TKFLITQILVALRHLHFKNIVHCDLKPENVLLASADPFPQVKLC 
DFGFARI IGEKSFRRSWGTPAY1JVPEVLLNQGYNRSLDMWSVG 
VIMYVSLSGTFPFNEDEDINDQ IQNAAFMYPASPWSHI SAGAID 
LINI^QVKMRKRYSVDKSI*SHPWLQEYQTWI*DLRELEGKMGER 
Y ITHESDDAR WEQFAAEH PLPGSGL PTDRDIjGGACPPQDHDMQG 
LAERISVL 


6307 


2136 


589 


CFLLPRGRDPEPPEAGAAAPCAPGAPDMSFRKVVRQSKFHHVFG 
QPVKNDQCYEDIRVSRVTWDSTFCAVNPKFLAVIVEASGGGAFL 
VliPLSKTGRIDKAYPTVCGHTGPVLDIDWCPHNDEVIASGSEDC 
TVMVWQ I PENGLTS PLTE P VWLEGHTKRVG 1 1 AWHPTARNVLL 
SAGCDNWIiIWNVGTAEEL YRLDS LHPDL I YNVS WNHNGS LFC S 
ACKDKSVRI IDPRRGTLVAEREKAHEGARPMRAIFLADGKVFTT 
GFSRMSERQLALWDPENLEBPMALQELDSSNGAIjLPFYDPDTSV 
VYVCGKGDSS1RYFEITEEPPYIHFLNTFTSKEPQRGMGSMPKR 
GLEVS KCE IARFYKLHERKCEP I VMTVPRKSDLFQDDLYPDTAG 
PEAALEAEEWVSGRDADPILISLREAYVPSKQRDLKISRRNVLS 
DSRPAMAPGS2HLGAPASTTTAADATPSGSLARAGEAGKLEEVM 
QELRALRALVKEQGDRICRLEEQLGRMENGDA 


6308 


2 


1118 


GRPTRPEKMLXiS lvlht ys mryll PS wllgt apt yvlawg vwr 

LLSAFLPARFYQALDDRLYCVYQSMVLFFFENYTGVQILIiYGDL 
PKNKENIIYIANHQSTVDWIVADIIiAIRQNALGHVRYVLKEGLK 
WLPLYGWYFAQHGG I YVKRS AKFNEKEMRNKLQS YVDAGTPM YL 
VI FPEGTR YOTEG/TKVIiS ASQAFAAQRGLAVLKHVLTPR I KATH 
VAFDCMKNYLDAIYDVTVVYEGKDDGGQRRESPTMTEFLCKECP 
KIHI H I DRIDKKDVPEEQEHMRR WLHER FBI KDKMLIEF YES PD 
PERRKRFPGKSVNSKLSIKKTLPSMLIIiSGLTAGMLMTDAGRKL 
YVNTW I YGTLLGCXiWVTI KA 


6309 


220 


563 


LVAEVKEPCSLPMIiSVDMENKENGSVGVKNSMENGRPPDPADWA 
VMDVVNYFRTVGFEEO^SAFQEQEIDGKSLLLMTRNDVLTGLQL 
KLGPALKIYEYHVKPLQTKHLKNNSS 


6310 


36 


979 


G PRCW KFLI LSS VN CETIiR IGKAW PQ£ SGQER YWTPRTHS S AS 3 
AQRGSLA3LNVAAAGLWADCDQPLYDCPMCGLI CTITYHILQEHV 
DLHLBENS FQQGMDRVQCSGDLQLAIIQLQQEEDRKRRS EES RQE 
1EEFQKLQRQYGLDNSGGYKQQQLRNMEIEVNRGRMPPSEFHRR 
KADMMESLALGFDDGKTKTSGI I EALHR YYQNAATDVRRVWLS S 
WDHFHSSLGDKGWGCGYRNFQMLLSSKLQNDAYNDCbKGMLIP 
CIPKIQSMIEDAWKEGFDPQGASQLIIRLQGTKAWIGACEVYIL 
LTSLRV 


6311 


1 


675 


P VW WNS CEG PRIAAAARTGHG VGRRARLACLGE PRVXAAVWUTL 
AS KLKRDDGLKGSRTAATASDSTRRVSVRDKLLVKEVAELEANL 
PCTCKVHFPDPNKLHCFQLTVTPDEGYYQGGKFQFETEVPDAYN 
MV?PKVKCI»TKIWHPNITETGB I CLS LIiREHS IDGTGWAPTRTL 
KDVVWGLNSLFTDLLNFDDPLNIEAAEHHLRDKEDFRNKVDDYI 
KRYAR 


6312 


213 


1400 


GDELVKREAGMKMLPGVGVFGTGSSARVLVPLLRAEGFTVEALW 
GKTEEEAKQLAEEMNIAFYTSRTDDILLHQDVDLVCIS I PPPLT 
RQI S VKAIX5 IGKNVVCEKAATS VDAFRMVTAS RYYPQLMS LVGN 
VLRFLPAFVRMKQLISEHYVGAVMICDARIYSGSLLSPSYGWIC 
DELMGGGGLHTMGTYIVDLLTHLTGRRAEKVHGLLKTFVRQNAA 
I RG I RHVTS DD FC FFQMLMGGG VCS TVTLNFNM PGA F VH E VMW 
GSAGRLVARGADLYGQKNSATQEELLLRDSLAVGAGLPEQGPQD 
VPLLYLKGMVYMVQA1JIQSFQGQGDRRTWDRTPVSP4AASFEDGL 
YMQSVVDAIKRSSRSGEWEAVBVLTEEPDTNQNLCEALQRNNL 


6313 


2 


2071 


QRSGAARLAFLPSPFS PACVHRSPLS FHGCKFYFVWFMPLGVL 
FHRRRAHGC TLS CSSFVEQ PTAMEAEETMECLQE FPEHHKM I LD 
RLNEQREQDRFTDITL I VDGHH FKAHKAVLAACS KFFYKFFQEF 
TQEPLVEI EGVSKMAFRHLI EFTYTAKLMIQGEEBANDVWKAAE 
FLQMLEAI KALE VRNKENSAPLEENTTGKNBAKKRKI AETSNVI 
TESLPSAESEPVEIEVEIAEGTIEVEDEGIETLEEVASAKQSVK 
Y I QSTGS SDDS ALALLAD I TS K YRQGDRKGQ I KEDGCPS D PTS K 
QVEG I E I VELQLS HVKDL FHCE KCNRS FKLP YHFKEHMKSHSTE 
SFKCEICNKRYLRESAWKQHLNCYHLEEGGVSKKQRTGKKIHVC 
Q YCE KQFDHFGHFKEHLRKHTGEKPFECPNCHER FARNS TL KCH 
LTACQTGVGAKKGRKKLYECQVCNSVFNSWDQFKDHLVIHTGDK 
PNHCTLCDLWFMQGNELRRHLSDAHNISERLVTEEVLSVETRVQ 
TEPVTSMTI IEQVGKVHVLPLLQVQVDSAQVTVEQVHPDLLQDS 
QVHDS HMSELPEQVQV3 YLEVGRIQTBEGTEVHVEEIiHVE RVNQ 
MPVEVQTELLEADLDHVTPEIMNQEERESSQADAAEAAREDHED 
AEDLETKPTVDSEAEKAENEDRTALPVLE 


6314 


2 


2071 


QRSGAARLAFLPSPFSPACVHRSPLSFHGCWFYFVWFMPLGVL
FHRRRAHGCTLSCSSFVEQPTAMEAEETMECLQEFPBHHKMILD 
RLNEQREQDRFTDITLIVDGHHFKAHKAVLAACSKFFYKFFQEF
TQEPLVEIEGVSKMAFRHLIEFTYTAKLMIQGBEEANDVWKAAE 
FLQMLEAI KALEVRNKENSAPLEENTTGKNEAKKRKIAETSNVI 
TESLPSAESEPVEIEVEIAEGTIEVEDEGIETLEEVASAKQSVK 
YIQSTGSSDDSAIALLADITSKYRQGDRKGQIKEDGCPSDPTSK 
Q VEG I E I VELQLSHVKDLFHCEKCNRS FKLF YH F KEHMKS HSTE 
SFKCEICNKRYLRESAWKQHLNCYHLEEGGVSKKQRTGKKIHVC 
QYCEKQ FDHFGHFKEHLRKHTGEKPFECPNCHER FARNS TLKCH 
LTACQ TG VGAKKGRKKLYE CQ VCNSVFNS WDQ F KDH L VI HTGDK 
PNHCTLCDLWFMQGNELRRHLSDAHNISERLVTEEVLSVETRVQ
TEPVTSMTI I E QVGKVHVLPLLQVQVDSAQ VTVEQVHPDLLQDS 
QVHDSHMSELPEQVQVSYLEVGRIQTEEGTEVHVEELHVERVNQ 
MPVEVQTELLEADLDHVTPEIMMQEERESSQADAAEAAREDHED 
AEDLETKPTVDSEAEKAENEDRTALPVLE


6315 


1 


1015 


LGIAVNVVTTJjVtilSYCPTATEEAPYWTYLLCAIiGLFilYQSLDA - 
IIX3KQARRTNSCSPLGELFDHGCDSLSTVFMAVGAS IAARLGTY 
PDWFF S CSFIGMFVFYCAHWQTYVSGMLRFGKVDVTEIQI ALVI 
VFVIiSAFGGATMWDYTIPILEIKLKIIjPVLGFIiGGVI fscsnyf 
HV1LHGGVGKNGST IAGTSVLS PGLH IGLX 1 1 LA I M I YKKS ATD 
VFEKHPCXYILMFXSC^FAKVSQKLWAHOTKSELYLQmVFIiGP 
GLLFLDQYFNNF IDEYWLWMAMVISSFDMVI YFSALCLQ isrh 
LHLNI FKTACHQAPEQVQVLSSKSHQNNMD 


6316 


1503 


792


VSAGAGTG IMGGTTSTRRVTFEADENEN I TVVKGIRLS ENV I DR 
MKESSPSGSKSQRYSGAYGASVSDEELKRRVAEELALEQAKKES 
EDQKRLKQAXELD RERAAANEQLTRAI LRERI CSEEERAKAKHL 
ARQLEEKORVLKKQDAF YKEQIiARLEERSSE FYRVTTEQYQKAA 
EEVEAKFKRYESHPVCADLQAKILQCYRENTHQTLKCSALATQY 
MHCVNHAKQSMLEKGG 


6317 


102 


839 


PEAOrSAVUU^KGHLPTMRHEAPMQMASXQbARYGQKDSStbQM 
FDYMF KLLI IGNS SVGKTS FLFRYADDS FTS AFVS T VG I DFKVK 
i VI? ANfcKK J.KLQIWDTAGQERYRTITTAYYRGAMGFIIjMYDITN 
EE SFNAVQD WS TQI KT YSWDNAQVILVGNKCDMEDERVI STERG 
QHLGEQLGFEFFETSAKDNINVKQTFERLVDIICDKMSESLETD 
PAITAAKQNTRIjKETPPPPQPNCAC 


6318


1765 


733 


PWHPLRTLPLHHPHPRPPRAEGRSGADSMSHLPGLELRREAPPL 
LGPLLS P FPLPAGS WHRQMLRSSLRFPI TN5AGAPCXAAGRMNI 
IAPVRRDRVLAELPQCLRKEAALHGHKDFHPRVTCACQEHRTGT 
VGFKISKVIWGDLSVGKTCLINRFCKDTFDKNYKATIGVDFEM 
ERFEVLGIPFSLQLWDTAGQERFKCIASTYYRGAQAI I IVFNLN 
DVASLEHTKQWLADALKENDPSSVLLFLVGSKKDLSTPAQYALM 
E KDALQVAQEMKAE YWAVSSLTG ENVR EFF FRVAAIiTFEANVLA 
E LEKSGAR R IGDWR INSDDSNLYLTASKKKPTCCP 


6319 


88 


717 


AATMRLNQNTLLLGKKWLVPYTSEHVPSRYHEWMKSEELQRLT 
ASEP LTLEQE YAMQCS WQEDADKCTFI VLDAEKWQAQPGATE ES 
CMVGDVNLFLTDLEDLTLGEI EVM I ABPSCRGKGLGTEAVLAML 
SYGVTTLGLTKFEAKrGQGNEPSIRMFQKLHFEQVATSSVFQEV 
TLRLTVSESBHQWLLEQTSHVEEKPYRDGSAEPC 


6320 


90 


i mi 


RPRTGRBKVAMAAVDS FYLLYRE I ARSCNCYMEALALVGAWYTA 
RKSITVI CDFYSLI RLHFIPRLGSRADLIKQ YGRWAWSGATDG 
IGKAYABELASRGLNI ILISRNEEKLQWAKDIADTYKVETDII 
VADPSSGRE I YLP I REALKDKD VG 1 LVNNVGVFYP YPQYFTQLS 
EDKLWDI INVNIAAASLMVHWLPGMVERKKGAIVTISSGSCCK 
PT PQLAAFSASKAYLDHFS RALQYE YASKG 1 FVQSL I P F YVATS 
MTAPSNFLHRCSWLVPSPKVYAHHAVSTLGIS KRTTGYWSHS IQ 
FL FAQYM PEWLWVWGANI LNRSLRKEALSCTA 


6321 


1418 


341 


HRKAALGALMAGRLLG tCALAAVSLS LALAS VT X RS S RCRG IQAP 
RNS FS SS W FHLNTNVMS GS NGS KENSHNKARTS P YPGSKVERS Q 
VPNEKVGWLVEWQDYKPVEYTAVSVLAGPRWADPQISESNFSPK 
FNEKDGH VER KS KNGLYE I ENGR P RNP AGRTGLVGRGLLGRWG P 
NHAADPIITRWKRDSSGNKIMHP7SGKHILQFVAIKRKDCXJEWA 
IPGGMVDPGEKISATLKREFGEBALNSLQKTSAEKREIBEKLHK 
LFSQDHLVIYKGWDDPRNTDNAWMETEAVNYHDETGEIMDNLM 
LEAGDDAGKVKWVDINDKLKLYASHSQFIKLVAEKRDAHWSEDS 
EADCHAL 


6322


2047 


1083 


NQE I LKNVESSRTVQPHFLEFIiLS LOWS VD VGRHPGWTGHVSTS 
WS 1NCCDDGEGSQQEE VISSEDIGASI FNGQKKVLYYADAIiTEI 
AFWPSPVESLTDSLESNISDQDSDSNMDLMPGILKQPSLTLEL 
FPNHTDNLNS SQRLSPSSRMRKLPQGRPVPPLGPETRVSVVWVE 
RYDDIENFPLSELMTEISTGVETTANSSTSLRSTTLEKEVPVIF 
IHPLNTGLFRIKIOGATGKFNMVIPLVDGMIVSRRALGFLVRQT 
VINICRRKRIiESDSYSPPHVRRKQKITDIVNKYRNKQLEPEFYT 
SLFQEVGLKNCSS 


6323 


1 


656 


PASTTDGAQEARVPLDGAF W I PRP PAGSPKGCFACVSKPPALQA 
P AAPAPEPS AS P PMAPTLFPMES KS S KTDSVRAAGAP PACKHLA 
E KKTMTNPTTVI E VY PDTTEVNDY YLWS IFNFVYLNFCCLG FI A 
LAY S LKVRDKKLLNDLNG AVEDAKTDRLIN ITRS GLAASC I MLW 
MALS VIATHRGLRSSAS ILVAEPHDWNTERPQVTFRERCPAL 


6324 


1 


2061 


EGAGMRRCPCRGSLNEAEAGALPAAARKGLEAPRGQRRRQPGQQ 
RPGPGAGAPAGRPEGGGPWARTEGSSLHSEPERAGLGPAPGTES 
PQAE FWTDGQTEPAAAGLG VETERPKQKTEPDRS S LRTHLE WS W 
SELGTTCLWTETGTDGLWTDPHRSDLQFQPEEAS PWTQPGVHGP 
WTELETHGSQTQPER VKS WADNLWTHQNSSSLQTHPEGACPS KE 
PSADGSWKELYTDGSRTQQDIEGPWTEPYTDGSQKKQDTEAARK 
QPGTGGFQ I QQDTDGS WTQPSTDGSQTAPGTDCLLGEPEDG PLE 
EPEPGELLTHLYSHLKCSPLCPVPRLIITPETPEPEAQPVGPPS 
RVEGGSGGFSS AS S FDESEDDWAGGGGASDPEDRSGS KP WKKL 
KTVLKYSPFWS FRKH YP WVQLSGHAGNFQAGBDGR I LKRFCQC 
EQRSLEQLM KDP LRPFVPAYYGMVLQDGQTFNQMEDLLADFEGP 
SIMDCKMGSRTYLEEELVKARERPRPRKDMYEKMVAVDPGAPTP 
EEHAQGAVTKPRYMQWRETMSSTSTLGFRIEGI KKADGTCNTNF 
KKTQALEQVTKVLEDFVDGDHVI LQKYVACLEELREALE I S P FF 
KTHEVVGSSLLFVHDHTGLAKVtWIDFGKTVALPDHQTLSHRLP 
WAEGNREDG YLWGLDNM I CLLOGLAQS 


6325


165 


944 


GLRD P FRRKRRLKPQ VKMSN YVNDM WPGSPQEKDS PS TSR5 GGS 
SRLSSRSRSRSFSRSSRSHSRVSSRFSSRSRRSKSRSRSRRRHQ 
RKYRRYSRSYSRSRSRSRSRRYRERRYGFTRRYYRSPSRYRSRS 
RSRSRSRGRSYCGRAYAIARGQRYYGFGRTVYPEEHSRWRDRSR 
TRSRSRTPFRI>SEKDRMEIjLEIAKTNAAKAIjGTTNIDLPASLRT 
VPSAKETSRG IGVSSNGAKPEVSILGLSEQNFQKANCQI 


6326 


238 


680 


geps patqqkpsatgagvlhqhfssghi yvlmgllpp pwt IS FT 
vqttlqppgglpaapvsgrmafepvgrdiiarrmvpragkrtqtl 
garrvaaqgarplpedrrpksgerlhvtvapcwefvlpsvslta 
QAWGGVGQBASSGVP 


6327 


1


1337 


SliARIiAPAGGSWMPTQQPAAPSTRAPKPSR^LSGSijCALFSDA 
DSGSGMKAELPPGPGAVGREMTKEBKLQLRKEKKQQKKKRKEEK 
GAEPETGS AVS AAQ CQGPTRELPESG I QLGTPRE KVP AGRS KAE 
LRAERRAKQEAERALKQARKGEQGGPPPKASPSTAGETPSGVKR 
LPEYPQVDDLLLRRLVKKPERQQVPTRKDYGSKVSLFSHLPQYS 
RQNSLTQFMSIPSSVIHPAMVRLGIiOYSQGLVRGSMARClALLR 
ALQQV I QDYTTP PNEELS RDL VNKLKP YMS FLTQCR P LSASMHN 
AIKFLNKEITSVGSSKREEEAKSELRAAIDRYVQEKIVLAAQAI 
S R FAYQK I SNGD VI LVYGCS S LVSRILQEAWTEGRRFRVWVDS 
R PWLEG RHTLRS L VHAGVP AS YLL I PAAS YVLPEVS TEE KDS KV 
GGEJCV 


6328


1030 


276 


HASAEVTTAAARGLGAMEEEMHTDAKIRAENGTGSSPRGPGCSli 
RHFACEQNLIjS RPDGSASFLQGDTS VLAGVYGPABVKVSKE I FN 
KATLEVILRPKIGLPGVAEKSRERLIRNTCEAWLGTLHPRTSI 

TWtiQWS dagsllacclnaacmalvdagvpmralfcgvacai*d 

SDGTLVLDPTSKQEKEARAVLTFALDSVERKLLMSSTKGLYSDT 
ELQQCLAAAQAASQHVFRFYRESLQRRYSKS 


6329 


3 


2016 


SS EVAAGGGTRS AMAEGSGE VVT VS ATGAANGLNNGAGGTSATT 
SNPLSRKLHKI LETRLDNDKEMLEALKALSTFFVENSLRTRRNL 
RGDI ERKS LAINEEFVS IFKEVKEELES IS EDVQAMSNCCQDMT 
S RLQAAKEQTQDL I VKTTKLQSESQKLE I RAQVABAFLS KFQLT 
S DEMS LLRGTREG P I TEDFTKALGRVKQ IHNDVKVLLRTNQO/IA 
GLEIMEQMALLQETAYERLYRWAQSECRTLTQESCDVSPVLTQA 
ME ALQDRP VLYKYTLDE FGTARRSTWRGF IDALTRGG PGGTPR 
PI EMHSHDPLRYVGDMLAWLHO/ATASEKEHLEAIjLKHVTTQGVE 
ENIQEWGHITEGVCRPLKVRIEQVIVAEPGAVLLYKISNLLKF 
YHHTISGIVGWSATALLTTIEEMHLLSKKIFFNSLSLHASKLMD 
KVEL PP PDI^PS SAIINQTLMLLRE VLASHDS S WPLDARQADFV 
QVLS CVLD PLUJMCTVSASNLGTADMATFM VNS LYMM KTTLALF 
E FTDRRLEMLQFQI EAHLDTL INEQAS YVLTRVGLS Y 1 YNTVQQ 
HKPEQGSIANMPNLDSVTLKAAMVQFDRYLSAPDNLL I PQLNFL 
L S ATVKEQ 1 VKQSTBLVCRAYGE VYAAVMNP INEYKDPENI LHR 
SPQQVQTLLS 


6330 


1151 


333 


FFY YTF YENKTFSRKMVAEKETIiSIjNKC PDKMPKRTKLLAQQFL 
PVHQPHSLVSEGFTVKAMMKNSVVRGPPAAGAFKERPTKPTAFR 
KFYERGDFPIALEHDSKGNKIAWKVEIBKLDYHHYLPLFFDGLC 
EMTFP YEFFARQGIHDMLEHGGNKILPVLPQLI IPIKNALNLRN 
RQVI CVTLKVLQHLWS AEMVGKALVP YYRQ I LP VLNI FKNMNV 
NSGDGIDYSQQKRENIGDLIQETLEAFERYGGBNAFINIKYWP 
TYESCLLN 


6331 


3 


495


QQGQRVRTRGRRAC^ATPbEQCVDLSYPRTHAALLKVAQMVTL 
LIAFICVRSSLWTNYSAYSYFEVVTICDLIMILAFYLVHLFRFY 
RVLTClSWPLSELLHYLIGTLLLLIASIVAASKSYNQSGIiVAGA 
I FGFMATFLCMASIWLS YKISCVTQSTDAAV 


6332


1 


878 


.VTESNKFDLVSFIPIiLRERIYSNNQYARQFlIlSWILVLliSVPDI 
NLLDYLPEILDGLFQILGDNGKEIRKMCEWLGEFLKEIKKNPS 
S VKFAE MAN I LVIHCQTTDDLIQLTAMCWMREF I QLAGR VMLP Y 
S S G I LTAVL P CLiAYD DRKKS IKEVAKVCNQSLMKLVTPEDDELD 
BLRPGQRQABPTPDDALPKQEGTASGEWTPSLHLTSCRGPREPD 
VI GVALGP HLSNQDYFMYVTHT I VAATQ RSGSSGS PP FCRQDTG 
KLSTMATHSQLVKTGTGLE PRQAVS S SH 


6333 


3 


1467 


TRTPSEAEAGGESPQSCVSAAHSDWTAGKPVSLLAPLIPPRSAG 
QPLTFS PSGRQPLRSLLVGMCSGSGRRRSSLS PTMRPGTGAERG 
GIWMGHPGMHYAPMGMHPMGQRANMPPVPHGMMPQMMPPMGGPP 
MGQMPGMMSSVMPGMMMSHMSQASMQPALPPGVNSMDVAAGTAS 
GAKSMWTEHKSPDGRTYYYNTBTKQSTWBKPDDLKTPAEQLLSK 
CPWKSyKSDSGKPYYYNSQTKESRWAKPKELEDLEGYQNTIVAG 
SLITKSNIiHAMIKAEESSKQEECTTTSTAPVPTTEI PTTMSTMA 
AAEAAAAWAAAAAAAAAAAAANANASTSASNTVSGTVPVVPBP 
E VTS I VAT WDNENTVTI STEEQAQLTST PAIQDQSVEVS SNTG 
BETS KQETVAD FTP KKEE EESQ P AKKTYT WNT KE EAKQ AF ECELL 
KBKRVPSNASWEQAMKMI INDPR YSAIiAKLS EKKQAFNAYKVQT 
EKK 


6334 


17


644 


GGNPSGRAAGFAAAAMPSSPLRVAWCSSNQNRSMEAHN1LSKR 
G FS VR5 FGTGTHVKLPG PAPDK PNVYDFKTT YDQMYNDLLRKDK 
ELYTQNGII.HMLDRNKRIKPRPERPQNCKJDLFDLILTCEERVYD 
QWEDLNSREQETCQPVH WNVDIQDNHEEATLGAFL ICELCQC 
IQHTEDMENBIDELLQBFEEKSGRTFLHTVCPY 


6335 


82 


529 


AARAR PG VLCCRIitiGAALGDQSRVEMS YI PGQ P VTAWQRVE IH 
KLRQGENLILGFSIGGGIDQDPSQNPFSEDKTDKGIYVTRVSEG 
GPAE I AGLQ I GDKIMQVNGWDWTMVTHDQARKRLTKRS EE WRL 
LVTRQS LQKAVQQSMLS 


6336


1003 


438 


HEPASKGRAEVGNMRL6VAAAISHGRVFRRMGLGPESRIHLLRN 
LLTGL VRHERI EAP WARVDEMRG YAE KL IDYG KLGDTNERAMRM 
AD FWLTEKDL I PKLPQVIAPRYKDQTGGYTRMLQ I PNRSLDRAK 

MAVIEYKGNCLPPLPLPRRDSHLTLLNQLLQGLRQDLRQSQEAS 
NHSSHTAQTPGI 


6337 


76 


524 


EGIQMLSVQPDTKPKGCAGCNRKIKDRYLLKALDKYWHEDCLKC 
ACCDCRLGEVGSTLYTKANLILCRRJDYLRLFGVTGNCAACSKLI 
PA FE MVMRAKDNVYHLDC FACQLCNQR FCVGD KF FLKNNM I LCQ 
TDYBEGLMKEGYAPQVR 


6338


66 


1349 


APNSESGTQGPLPTPANLFWTRRANPDPTTSMSATDRMGPKAVP^ 
GLRLALliLLLGLGTPKSGVQGQEGLDFPEYDGVDRVINVNAKNY 
KNVFKKYEVLALLYHEPPEDDKASQRQFEMEELILEIiAAQVIJED 
KGVGFGLTOSEKDAAVAKKLGLTEVDSMYVFKGDEVIEYDGEFS 
ADTIVEFIiLDVLEDPVEIjI EGERELQAFENI EDE I KLIGYFKS K 
DSEH YKAFEDAAEEFHPYI PFFATFDS KGAKKLTLKLNE I D FYE 
AFMEEPVTIPDKPNSEEEIVNFVEEHRRSTLRKLKPESMYETWE 
DDMDG IHI VAFAEEADPDGFE FLETLKAVAQDMTENPDLS I IWI 
DPDD FP LLVPY WE KTFDI DLS APQ I G WNVTDADRLWMEMDDE E 
DLPSAEELEDWIiEDVLEGE INTEDDDDDDDD 


6339 


246 


1813 


NRCDRGGGGQAERQAGQGCRTQGAGPGFGFGHSFFSQGAMKAFH 
TFCWLLVFGSVSEAKFDDFEDEED I VE YI3DNDFAEFEDVMEDS 
VTESPQRVIITEDDEDBTTVELBGQDENQEGDFEDADTQEGDTE 
SEPYDDEEFEGYEDKPDTSSSlQnCDPITIVDVPAHLQNSWESYY 
LEILMVTGIiLAY IMNYI IGKNKNSRLAQAWFNTHRELLESNFTL 
VGDDGTNKEATSTGKLNQENEHIYNLWCSGRVCCEGMIjIQLRFL 
KRQDLLNVLARMMRPVSDQVQ I KVTMNDEDMDTYVFAVGTRKAL 
VRLQKEMQDLSEFCSDKPKSGAKYGIiPDSLAILSEMGEVTDGMM 
DTKMVHFLTHYADKIESVHFSDQFSGPKIMQEEGQPLKLPDTKR 
TLLLTFNVPGSGNTYPKDMEALLPLMNMVIYSIDKAKKFRLNRE 
GKQKAD KNRARVE ENFL KLTHVQRQEAAQS R RE E KKRAE KER I M 
NEEDPEKQRRLEEAALRREQKKLEKKQMKMKQIKVKAM 


6340

2 


583 


EACAHTLSCPAFARLGRARRRPWMSHRTSSTFRAERS FHSSSSS 
S S S STS SS AS RALPAQDPPME KAL S MFS DDFG5 FMRPHSEP LAF 
PARPGGAGNIKTLGDAYEFAVDVRDFSPEDI I VTTSNNH I E VRA 
EKLAADGTVMNNFAHKCQLPEDVDPTSVTSALREDGSLTIRARR 
HPHTEHVQQTFRTEIKI 


6341 


2 


645 


KMAVLSAPGLRGFR I LGLRES VG PAVQARGVHQS VATDGPSSTQ 
PALPKARAVAPKPS SRGE YWAKLDDLVNWARRSSLWPMTFGLA 
CCAVEMMHMAAPRYDMDRFGVVFRASPRQSDVMXVAGTIiTNKMA 
PALRKVYDQMPEPRYWSMGSCANGGGYyHYSYSWRGCDRIVP 
VDIYIPGCPPTAEALLYGILQLQRKIKRBRRLQIWYRR 


6342 
6343


2 


1191


DPRVF^LATI^VAAIJ^IHJLFSGRGGGRGLWTORPQSDMNNI 
KPLEGVK1 LDLTRVLAG PFATMNLGDLGAEVI KVERPGAGDDTR 
TWGP PFVGTESTYYLS VNRNKKS I AVNIKDPKGVKI I KELAAVC 
D VFVENYVPG KLS AMGLG YEDIDE I APH 1 1 YCS ITG YGQTGP I S 
ORAGYDAVASAVSGLMHITGPBVACIiSHIAANYLIGQKEAKRWG 
TAHGS IVPYQAFKTKDGYI WGAGNNQQFATVCKILDLPELIDN 
S KYKTNHLRVHNRKEL I KI L&ERFEE ELTS KWL YLFEGSGVPYG 
P INNMKEJVFAEPQVLHNGLVMEMEHPTVGKISVPGPAVRYSKFK 
MSEARPFPLLGQHTTHILKEVLRYDDRAIGELLSAGWDQHETH 




2 


936 


GTAMVSDEDEX.NLLVIWDANPIWWGKQALKESQFTLSKCIDAV 
M VLGNSHLFMRRSNKLAVI ASHI Q ESRFLYPGKNGRLGD FFGDP 
GNPPEFNPSGS KDGKYBLLTSANEVI VEEI KDLMTKSDIKGQHT 
ETLLAGSIAKALCnriHRra^KEVKDNQEMKSRILVIKAAEDSALQ 
YMNFMNVIFAAQKQNILIDACVLDSDSGLLQQACDITGGLYLKV 
PQMPS LLQYLLW VFLPDQDQRSQLILP PPVHVD YRAACFCHRNL 
1EIGYVCSVCLSIFCNFSPICTTCETAFKISLPPVLKAKKKKLK 
VSA 


6344 
6345 


2508 


147


TMPTATI^NI*RGYGMAS PGLAAPSLTP PQLAT PNLQQFFPQ ATR 
QSrJjGPPPVGVP!<OJPSQFNLSGRNPQKGARTSSSTTPNRKDSSS 
QTMPVEDKSDPPEGSEEAAEPRMDTPEDQDLPPCPEDIAKEKRT 
PAPEPEPCEAS ELPAKRLRS SEE PTEKEPPGQLQVKAQPQ ARMT 
VPKQTQTPDL^PEALEAQVLPRFQPRVLQVQAQVQSQTQPRIPS 
TDTQVQPKLQ KQAQTQTSPEHLVLQQKQVQPQLQQSAEPQKQVQ 
PQVQPOAHSQGPRQVQLQQEAEPLKQVQPQVQPQAHSQPPRQVQ 
LQLQKQVCJTQTYPQVHTQAQPSVQPQEHPPAQVSVQPPEQTHEQ 
PHTQ PQVSLXiAPEQTPVVVHVCGLEMP PDAVE AGGGMEKTLPEP 
VGTQ VS MEE I QNE S AOGLDVGECENRAR EMPG VWGAGGS LKVT I 
LQSSDSRAFSTVPLTPVPRPSDSVS5TPAATSTPSKQALQFFCY 
ICKASCSSQQEFQDHMSEPQHQQRLGEIQHMSQACLLSLLPVPR 
DVLETEDEE PPPRRWCNTCQLYYMGDLI QHRRTQDHKI AKQSLR 
PFCTVCOTYFKTPRKFVEHVKSQGHKDKAKELKSLEKEIAGQDE 
DHFITVDAVGCFEGDEEEEEDDEDEEEIEVEEELCKQVRSRDIS 
REEWKGS ETYS PNTAYGVDFLVP VMG YICRICHKFYHSNSGAQL 
SHCKSLGHFENLQKYKAAKNPSPTTRPVSRRCAINARNALTALF 
TSSGRPPSQPNTQDKTPSKVTARPSQPPLPRRSTRLKT 




2 


3483 


PR VRTKL I LLVNDKKRYER VG GG PKRLGRDVEMEEM I EQLQE KV 

HELEKQ^TLKNRLISAKQOLQTQGYRQTPYNNVQSRINTGRRK 

ANENAG LQECPRKG I KFQDAD VAETPHPMFTKYGNSLLEEARGE 

IRNLENVIQSOJRGQIEELEHIiAEILKTQLRRKENEIELSLLQLR 

EQOATDQRSNIRDNVEMIKLHKQLVEKSNALSAMEGKFIQLQEK 

QRTUCISHDALMANGDELNMQLKEQRLKCCSLEKQLHSMKFSER 

RIEELQDRINDLEKERELLKBNYDKLYDSAFSAAHEEQWKLKEQ 

UJjjwyi. AyijK 1 AJUKS DLTDKTEI I^RLKTERD^ 

QLQYLEQKQQLDELKKRI KLYKQEND INADEIjSE ALLLIKAQKE 

QKNGDLSFLVKVDSEINKDLBRSMRELQATHAETVQEL3KTRNM 

LIMQHKINKDYQMBVEAVTRKMENLQQDYELKVEQYVHLLDIRA 

ARIHKIiEAQLKDIAYGTKQYKFKPEIMPDDSVDEFDETIHLERG 

ENLFE I HINKVTFS SE VLQASGDKEPVTFCTYAFYDFELQTTP V 

VRGLH PE YNFTSQ YLVHVNDL FLQ YIQKNTITLEVHQAYSTB YE 

TIAACQLKFHEILEKSGRIFCTASLIGTKGD I PNFGTVEYWFRb 

RVPMDQAIRLYRERAKALGYI TSNFKGPEHWQSLSQQAPKTAQL 

SSTDSTDGNLNELHITIRCCNHLQSRASHIiQPHPYWYKFFDFA 

DHDTAI I PSSNDPQFDDHMYFPVPMNMDLDR YLKSESIiS FYVFD 

DSDTQENIYIGKVNVPLISLAHDRCISGIFELTDHQKHPAGTIH 
VIIiKWKFAYLPPSGSITTEDLGNFIRSEEPEWQRLPPASSVST 
LVLAPRPKPRQRLTPVD KKVS FVDI MPHQS D VSQEGS VD EVKEN 
TEKMQQGKDDVSLLSEGQLAEQSLASSEDETEITEDLEPEVEED 
MSASDSDDCI I PGPXSKNI KQPSBKI RIEI IALSLNDSQVTMDD 
TIQRLFVECRFYSLPAEETPVSLPKPKSGQWVYYNYSNVIYVDK 
ENNKAKRDILKAILQKQEMPNRSLRFTWSDPPEDEQDLECEDI 
G VAH VDLADM FQEGRDL I EQNIDVFDARADGEG IGKLRVTVEAI* 
HALQSVYKQYRDDLEA 


6346 


2321 


533 


QDRRLLRIjELQKTCQPTSTMSGSHTPACXjPFSAtiTPS IWPQE IL 
AKYTQKEES AEQ PE FY YDEFGFRVYKEEGDEPGS SLLANS PLME 
DAPQRLRWQAHLEFTHNHDVGDLTWDKIAVSLPRSEKLRSLVLA 
GIPHGMRPQLWMRLSGALQKKRNSELSYREIVKNSSNDETIAAK 
Q I EKDLLRTMP SNACFASMGS IGVPRLRRVLRALAWLYPE IGYC 
QGTGMVAACLLIjFLEEEDAFWMMSAIIEDLLPASYFSTTLLGVQ 
TDQRVLRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAFASW 
DIKLLIiRlWDLFFYEGSRVLFQLTLGMLHTiKEEELIQSENSASI 
FNTLS D I PSQMEDAELLIiGVAMRLAGSLTDVAVETQRRKH LAYL 
IADQGQLLGAGTLTNLSQWRRRTQRRKSTITALLFGEDDLEAL 
KAKNI KQTELVADLREAILRVARHFQCTDPKNCSWSRQLPGLIj 
PNTALTPPTPLVGLYSLWQELTPDYSMBSHQRDHENYVACSRSH 
RRRAKALLDFBRHDDDELGFRKNDI ITIVSQKDEHCWVGELNGL 
RGWFPAKFVE VLDERS KEYS IAGDDSVTEGVTDIiVRGTLCPALK 
ALFEHGLKKPSLLGGACHPWLFIEEAAGREVERDFASVYSRLVL 
CKTFRLDEDGKVLT PEELLYRAVQS VNVTHDAVHAQMDVKLRS L 
ICVGLNEQVLHLWLEVLCSSLPTVEKWYQPWS FLRS PGWVQIXC 
ELRVL C CFAFS LSQDWELPAKREAQQPLKEG VRDMLVKHHL FS W 
DVDG 


6347 


2921 


533 


QDRRLLRLELQKTCQPTSTMSGSHTPACGPF3ALTPSIWPQEIL 
AKYTQKEESAEQPEFYYDEFGFRVYKEEGDEPGSSLIiANSPIjME 
DAPQRLRWQAHLEFTHNHDVGDLTWDKIAVSLPRSEKIjRSLVIiA 

giphgmrpqlwmrlsgalqkkrnselsyreivknssndetiaak 
QIEKDLIiRTMPSNACFASMGS igvprlrrvlralawlypeigyc 
qgtgmvaaclllfleeedafwmmsai iedllpas yfsttllgvq 
tdqrvlrhlivqylprldkllqehdielslitlhwfltafasw 

DIKLLLRIWDLFFYEGSRVLFQLTLGMLHLKEEEIiIQSENSASI 

fntlsdipsqmedaelllgvamrlagsltdvavetqrrkhlayl 

IADQGQLLGAGTLTNIjSQWRRRTQRRKSTITALLFGEDDLEAL 

kaknikqtelvadlreailrvarhfqctdpkncswsrqlpgll 
pntaltpptplvglyslwqeltpdysmeshqrdhenyvacsrsh 
rrrakalldferhdddelgfrkndiitivsqkdehcmvgelngl 
rgwfpakfve vlderskeys iagdds vtegvtdlvrgtlcpalk 
alfehglkkpsllggachpwlfieeaagreverdfasvysrlvl 

CKTF11IJ3EDGKVLTPEELLYRAVQSVNVTHDAVHAQMD VKLRS L 
IWGLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRSPGWVQIKC 
ELRVLCCFAFS LSQDWEL PAKREAQQPLKSGVRDMLVKHHLFSW 
DVDG 


6348 


3 


3679 


AGAEKCFVTLLACFLAKQQNKYKYEECKDLIKSMLRNELQFKEE
KLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGRDASRSLN 
EHLQALLTPDEPDKSQGQDLQEQIAEGCRIiAQHLVQKLSPENDN 
DDDEDVQVEVAEKVQK3SSPREMQKAEEKEVPEDSLEECAITCS 
NSKG PCDSNQ PHKNI KI TFEEDEVNSTLWDRES SHDECQDALN 
1 LP VPG PTSSATNVSMWS AGPIiSGEKAAINILE INE KLRPQLA 
EKKQQFRNLKEKCFLTQLACFLANQQN KYKYEECKDL I KFMLRN 
ERQFKEEKLAEQLKQAEELRQYKVI.VHSQERELTQLREKLREGR 
DASRSLNEHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQK 
LSPENDNDDDEDVQVEVAEKVQKSSAPREMPKAEEKEVPEDSLE 
ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW 

EDAVHIIPENESDDEEEEEKGPVSPRNLQESEEEBVPQESWDEG 

YSTLS I PPEMLASYKS YSSTFHSLEEQQVCMAVDIGRHRWDQVK 

KEDHEATGPRLSRELLDEKGPEVLQDSLDRCYSTPSGCLELTDS 

CQPYRSAFYVLEQQRVGLAVNMDEIEKYQEVEEDQDPSCPRLSR 

ELLDEKEPEVLQDSLGRCYSTPSGYIjELPDLGQPYSSAVYSLEE 

QYLGLALDVDRIKKDQEEEEDQGPPCPRLSRELLEWEPEVLQD 

SLDRCYSTPSSCLEQPDSCQPYGSSFyALBEKHVGFSLDVGBIE 

KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG 

PEVLQDSLDRCYSTPSGCLELTDSCQPYRSAFYILEQQRVGLAV 

DMDE1EKYQEVEEDQDPSCPRLSGELLDEKEPEVLQESLDRCYS 

TPSGCLBLTDSCQPYRSAFYILEQQRVGLAVDMDEIEKYQEVEE 

DQDPSCPRt^RELLDEKEPEVLQDSIiGRCYSTPSGYLELPDLGQ 

PYSSAVYSliEEQYDGLALDVDRIKKDQEEEEDQGPPCPRLSREL ' 

LEWEPEVIiQDSLDRCYSrPSSCLEQPDSCQPYGSSFYALEEKH 

VGFS LDVGE I E KKGKGKKRRGRRS KKE RRRGR K EGEEDQN P PCP 

RLNS MLMEVEEPEVLQDS LDI CYSTP SMYFE L PDS FQHYRS VF Y 

SFEEEHISFALYVDNRFFTLTVTSLHLVFQMGVI FPQ 


6349 


3 


3679


AGAE KCFVTLLACFLAKQQNKYKYEE CKDL I KSMLRNELQ FKE E 

KLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGRDASRSLN 

EHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQKLSPENDN

DDDEDVQVEVAEKVQKSSSPREMQKAEEKEVPEDSLEECAITCS 

NSHGPCDSNQPHKNIKITFEEDEVNSTLWDRESSHDECQDALN 

ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPQLA 

EKKQQ FRNLKEKCFLTQLACFLANQQNKYKYEECKDLI KFM LRN 

ERQFKEEKIAEQLKQAEELRQYKVLVH5QERELTQLREKLREGR 

DASRSLNEHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQK 

LSPENDNDDDBDVQVEVAEKVQKSSAPREMPKAEEKEVPEDSIiE 

ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW 

EDAVHIIPENESDDEEEEEKGPVSPRNLQESEEEBVpQESWDEG 

YSTLS I PPEMIiASYKS YSSTFHSLEEQQVCMAVDIGRHRWDQVK 

KEDHBATGPRLSREIiLDEKGPEVLQDSLDRCYSTPSGCIiELTDS 

CQP YRSAF YVLEQQRVGLAVNMDE I EKYQEVEEDQDPS CPRLS R 

ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE 

QYLGLALDVDRIKKDQEEEEDQGPPCPRLSRELLEWEPEVIiQD 

SLDRCYSTPSSCLEQPDSCQPYGSSFYALBEKHVGFSLDVGEIE 

KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG 

PEVLQDS LDRCYS TPSGCLBLTDS CQP YRSAFY IIiEQQRVGLAV 

DMDEIEKYQEVEEDQDPSCPRLSGELLDBKEPEVLQESLDRCYS 

TPSGCLELTDSCQP YRSAFYI LEQQRVGLAVDMDE IE KYQEVE E 

DQDPSCPRLSRELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQ 

P YS S AVYSLEEQ YLGLALDVDR X KKDQEEEEDQG P PCPRLSREI* 

LEVVEPEVLQDSLDRCYSTPSSCLEQPDSCQPYGSSFYAIjEEKH 

VG FSLDVGEI E KKG KGKKRRGRRSKKER RRGR KEGEEDQNPP C P 

RI^SMLMEVEBPEVLQDSI^ICYSTPSMYFELPDSFQHYRSVFY 

S FEEBH I S FAL YVDNRFFTLTVTSLHLVFQMGVI FPQ 


6350 


3 


3679 


AGAEKCFVTLLACFLAKQQNKYKYEECKDLIKSMLRNELQFKEE
KLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGRDASRSLN 
EHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQKLSPENDN 
DDDEDVQVEVAEKVQKSSSPREMQKAEEKEVPEDSLEECAITCS 
NSHGPCDSNQPHKNIKITFEEDEVNSTLWDRESSHDECQDALN 
ILPVPGPTS satnvsmwsagplsgekaaini le INEKLRPQLA 
EKKQQFRNLKEKCFLTQLACFLANQQNKYKYEECIGDLI KFMLRN 
ERQFKEEKLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGR 
DAS RSLNEHLQALLTPDEPDKS QGQDLQEQLAEGCRLAQHLVQK 
LSPENDNDDDEDVQVEVAEKVQKSSAPREMPKAEEKEVPEDSLE 
SEQ ID NO: 

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence 

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence 

Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








ECAXTCSNSHGPyDSNQPHRKTKlTFEEDKVDSTLIGSSSHVEW 
EDAVHIIPENESDDEEEEEKGPVSPRNLQESEEBEVPQESWDEG 
YSTLSIPPEMLASYKSYSSTFHSLEEQQVCMAVDIGRHHWDQVK 
KEDHEATGPRLSRELLDEKGPEVLQDSLDRCYSTPSGCLELTDS 
CQPYRSAPYVLEQQRVGLAVNMDEIEKYQEVEEDQDPSCPRLSR 
ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE 
yxUsliAL*DVDRIKKDQEEEEDQGPPCPRLSREIiLEWEPEVIiQD 
SliDRCYSTPSSCLEQPDSCQPYGSSFYALBEKHVGFSLDVGEIE 
KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG 
PEVLQDSLDRCYSTPSGCLELTDSCQPYRSAFYILEQQRVGIiAV 
DMDEIEKYQEVEEDQDPSCPRLSGELLDEKEPEVLQESLDRCYS 
TPSGCLELTDSCQPYRSAFYILEQQRVGLAVDMDEIEKYQEVEE 
DQD PS CPRLSRE LLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQ 
PYS SAVYSLE EQ YLGLALDVDR I KKDQBEE EDQGPPCPRLSREL 
LE WE PE VLQDSLDRC YSTPS S CLEQPDS CQP YGS S FYALEEKH 
VGFSLDVGEIEKKGKGKKRRGRRSKKERRRGRKBGEEDQNPPCP 
RLNSMLMEVEEPEVIiQDSLDICYSTPSMYFELPDSFQHYRSVFY 
S FEE EH I S FALYVDNRFPTLTVTS LHLVFQMGVIFPQ 


6351 


129.1 

- 


319 


KKARRRTERSQLGRMLVVEVANGRSLVWGAEAVQALRERIiGVGG - 
RTVGALPRGPRQNSRLGLPtiLLMPEEARHiAE I GAVTIi VS APRP 
u&KiinbUAjjTSrKRO^EESFQEQSAIiAAEARETRRQEljLEKITE 
GQAAKKQKLEQASGASS SQEAGS S QAAKEDETS DGQASGEQEEA 
GPS S SQAGPSNGVAPLPRSALLVQIjATARPRP VKARPLDWRVQS 
KDWPHAGRPAHELRYSIYRDLWERGFFLSAAGKFGGDFLVYPGD 
PLRFHAHYIAQCWAPEDTIPLQDLVAAGRLGTSVRKTLLLCSPO 
PDGKWYTSLQWASLQ 


6352 


235 


923 


WSEWIiSPCHAAKCKGLSMLRITMKTRAISLAADATEFVQGRSAP 
AMARSLVHDTVFYCLSVYQVKISPTPQLGAASSAEGHVGQGAPG 
LMGNMNPEGGVNHENGMNRDGGMIPEGGGGNQEPRQQPQPPPES 

PAOAAMEGPOPKWMOPP T"P P T>I^PT»t t rwrmr edirpnnniAunw^ 

rn « w «av»rv^ miriufK i kk i KirrJjXiQVEELES VFRHTQYPDVP 

TRRELAENLGVTEDKVRVWFKNKRARCR{^QREIJ4IJ\NELRADP 
DDCVYIWD 


6353 


45 


672 


K t'AG AG A I PtARAR PPDVQAAEEEKEMDLPDS AS RVFCGRI LSM 
VNTDDVNAIILAQKNMLDRFECTNEMIJjNFNNLSSARIK^ 
FLHHTRTLVEMKRDLDS I FRRI RTIiKGKLARQHPEAFSH I PEAS 
FLEEEDEDPI PPSTTTT XATS EQS TGS CDTSPDTVS PS LS PG FE 
DLSHVQPGSPAINGRSQTDDEEMTGE j 


6354 
6355 


965 


510 


PSIiRPMEPTRDCPLFGGAF^SAiLPMGAIDVSDLRPVPDNQEVFC 
HPVTDQSLIVELLELQAHVRGEAAARYHFEDVGGVQGARAVHVE 
S VQPLS LENLALRG R CQEA WVLSG KQQ IAKENQQVAKD VTLHOA 
LLRLPQYQTDLLLTFNQPP 




158 


1662 


RGSSAAFRQSQLPflAMTgPvf nirr , tJir , T>ar t Tpgn<imnn^v^. u — 
auj^uu x\a*ijuijiu»/u»mkk v IjIr'nbMuKUiiLTRRPGTRRGGFSIiD 

WDGKVSElKKKIKSILPGRSCDLIiQDTSHLPPEHSDWIVGGGV 
IiGLSVAYNIjKKLESRRGAIRVLWERDHTYSQASTGLSVGGICQ 
QFSLPENIQLSLFSASFLRNINEYIAWDAPPLDLRFNPSGYLL 
IASEKDAAAMESNVKVQRQEGAKVSLMSPDQLRNKFPWINTEGV 
ALASYGMEDEGWFDPWCLLQGLRRKVQSLGVLFCQGEVTRFVSS 
SQRMLTTDDKAVVLKRIHEVHVKMDRSLEYQPVECAIVIHAAGA 
WSAQLAAIAGVGEGPPGTLQGTKLPVE PRKRYVYVWHCPQGPGXi 
ETPLVADTSGAYFRREGLGSNYLGGRSPrEQEEPDPANLEVDHD 
FFQDKVWpHIiALRVPAFETLKVQS AWAG YYDYNTFDQNGWG PH 
PLWNM YFATGFSGHGLQQAPGI GRAVAEMVLKGRFQT IDLS PF 
LFTRFYLGEKI QENNI I 


6356 


354 


633 


TGbTSSCLPI^VMMTKRTKDMGKFSSVTVSTIDEEEEEIEAREV 
ADSYAQNAKVIEKQLERKGMSKRRLQELAELEAKKAKMKGTLID 
NQFK 


6357 


2 


915 


GLLRNMALLVRVLRNQTSISQWVPVCSRLIPVSPTQGQGDRALS 
RTS QWPQ MS QSQACGGSEQ IPG I D I QLNRKYHTTRKLSTTKDS P 
QPVEEKVGAFrKlIBAMGFTCPLKYSKWKIKIAALRMYTSCVEK 
TDFEEFFLRCQMPDTFNSWFLITLLHVWMCLVRMKQEGRSGKYM 
CRI I VHFMWEDVQQRGRVMGVNP YI LKKNM I LMTNHF YAAI LG Y 
DEG I LS DDHGLAAALWRT FFNRKCEDPRHLELLVE YVRKQ IQYL 
DS MNGEDLLLTGEVSWRPLVEKNPQS ILKPHSPTYNDEGL 


6358 


2009 


1040 


AS DALHSLSAP VLRLS S RS AARPATMTEQAI S FAKD FLAGG I AA 
AI SKTAVAP IE RVKLLLQVQHAS KQ I AADKQ YKGI VDCI VRI P K 
EQGVLS FWRGNLANVI RYFPTQALNFAFKDKYKQI FLGGVDKHT 
QF WR YFAGNLAS GGAAGATSLC FVYPLDFARTRLAADVGKSGTE 
REFRGLGDCX.VKITKSDGIRGLYQGFSVSVQGI II YRAAYFGVY 
DTAKGML P D P KNTH I WS WMIAQT VTAVAG WS Y PFDTVR RRMM 
MQSGRKGADIMYTGTVDCWRKIFRDEGGKAFFKGAWSNVLRGMG 
GAFVLVLYDEliKKVT 


6359 


98 


1086 


VCRQEEEKMKEDCLPSSHVPISDSKSIQKSELLGLLKTYNCYHE 
GKS FQLRHREEEGTL 1 1 EGLLN I AWGLRRP I RLQMQDDREQ VHL 
PSTS WMPRRPSCPLKEPS PQNGNITAQGPS IQPVHKAESSTDSS 
GPLEEAEEAPQLMRTKSDASCMSQRRPKCRAPGEAQR IRRHRFS 
I NGH FYNHKTS V FT PAYG S VTNVR VNS TMTTLQ VLTLL LN KFRV 
EDGP S E FALYI VHESGERTKLKDCB YPLISR I LHGPCE K I ARI F 
LMEADLGVEVPHEVAQYI KFEMPVLDSFVEKLKEEBERE I IKLT 
MKFQALRLTMLQRLEQLVEAK 


6360 


1 


345 


GTRGAVPSTLEE WIjP PRS CRVFW I HSGTTMS KVS FKI TLTSDP 
RLPYKVLSVPESTPFTAVLKFAABEFKVPAATSAI ITNDG IGIN 
PAQTAGNVFLKHGSELRI IPRDRVGSC i 


6361 


615 


158 


RPG LGQLQHCAliAPQAGNRRCRFHGRLHALTRSTHRGKPMS I HQ 
FKDTLNTPLPDSSPVAVPLGAPIAVASTLSVEHNDGVETGIWAC 
APGRWRRQI TSQE FCHF I QGRCTFT PDDGETLHI QAGDALML PA 
NSTG IWDIQETVRKTYVLIL 


6362 


350 


1575 


TTMDGSHSAALKLQQLPPTSSSSAVSEASFSYKENLIGALLAIF 
GHLWS I ALNLQKYCH I RJjAG S KD P RA YFKT KTW W LGL FLMLLG 
ELGVFAS YAFAPLSLIVPLSAVSVI ASAI IGI IFIXEKWKPKDF 
LRRYVLSFVGCGLAWGTYLLVTFAPNSHEKMTGENVTRHLVSW 
PFLLYMLVEI ILFCLLLYFYKEKNANNIWILLLVALIX5SMTW 
T VKAVAG M LVLS I QGNLQ LD Y P I F YVM FVCMVATAVYQAA FLS Q 
ASQMYDSSLIASVGYILSTTIAITAGAIFYLDPIGEDVLHICMF 
ALGCL IAFLGVFL ITRNRKK? I PFE P YI SMDAMPGMQNMHDKGM 
TVQ PELKAS FS YGALENNDN I S EI YAPATLFVMQEEHGSRS ASG 


6363 


21 


1201 


RkTRt^SSF'PRRRDSSAMESYDVIANQPVVIDNGSGVIKAGFAG 
DQIPKYCFPNYVGRPKHVRVMAGALEGDI FIGP KAEEHRGLLS I 
RYPMBHGIVKDWNDMERIWQYVYSKDQLQTFSEEHPVUiTEAPL 
NPRKNRERAAEVFFETFNVPALFISMQAVLSLYATGRTTGWLD 
SGDGVTHAVPIYEGFAMPHS I MRIDIAGRDVSRFLRIiYLRJCEGY 
DFHSSSEFEIVKAIKERACYIiSINPQKDETLETBKAQYYLPDGS 
TIEIGPSR FRAPKLL FRPDL IG RES KG IHEVL VFAI Q KSDMDLR 
RTLFSNIVLSGGSTLFKGFGDRLLSEVKKLAPKDVKIRISAPQB 
RliYSTWIGGSILASLDTFKKMWSKKEYEEDGARSIHRKTF 




21 


1201 


rrtrlgssfprrrdssamesydvianqpwidngsgvikAgfag 

DQ I PKYCFPNYVGRPKHVRVMAGALBGD IFIGPKAEEHRGLLS 1 
RYPMEHGIVKDWNDMERIWQYVYSKDQLQTFSEEHPVLLTEAPL 
NPR KNRERAAEVFFETFNVPALFISMQAVLS LYATGRTTGWIiD 
SGDGVTHAVPIYEGFAMPHS IMRIDIAGRDVSRFLRLYLRKEGY 
DFHSSSEFEIVKAI KERACYLS INPQKDETLETEKAQYYLPDGS 
TI E IG PS R FRAPBLLFRPDL IGEESEGIHE VLVFAI QKSDMDLR 








RTLFSNIVLSGGSTLFKGFGDRLLSEVKKLAPKDVKIRISAPQE 
RLYSTWIGGS ILASLDTFKKMWVSKKBYBEDGARSIHRKTF 


6365 


234 


1999 


KHKSRASCAAi^QAFGPSREREVH$RFRSGLRRLGESWSGCCTM 
ASMGTLAFDEYGRPFLI I KDQDRKSRLMGLEALKSHIWAAKAVA 
NTMRTS IX3PNGLDKKMVDKDGDVT VTNDGAT ILSMMDVDHQ I AK 
W4VELSKSQDDE IGDGTTOVVVLAGAIiLEEAEQLIiDRG I HP IRI 
ADQYEQAARVAIEHLDKISDSVLVDIKDTEPLIQTAKTTLGSKV 
VNS CHRQMAE IAVNAVLT VADMERRD VDFEL IKVEGKVGGRLE D 
TKL I KG VI VDKDFSHPQM PKKVEDAKI AI LTC PFE PPKPKTKHK 
LDVTSVEDYKALQKYKKEKFEEM IQQI KETGANLAIOQWGFDDE 
ANHLLLQNNLPAVRWVGGPEIELIAIATX3GRIVPRFSELTAEKL 
GFAGLVQE I S FGTTKDKMLVIEQCKNS RAVTI F I RGGNKM 1 1 EE 
AKRS LHDALC VIRNLIRDNRWYGGGAAE IS CALAVSQEADKC P 
TLEQYAMRAFADALEVI PMALSENSGMNP IQTMTE VRARQVKEM 
NPALG IDCLHKGTNDMKQQHVIETIiIGKKQQI SLATQMVRMILK 
IDDIRKPGE5EE 


6366 


257 


1698 


GNKEGAHSSTFWVLLS I FLGAVAMLCKEQGITVLG LNAVFDl L V ' 
IGKFNVLE I VQKVLHKD KS LENLGMLRNGGLLFRMTLLTSGGAG 
MLYVRWRIMGTGPPAFTEVDNPAS FADSMLVRAVNYNYYYSLNA 
WLLLCPWWLCFDWSMGCIPLIKSISDWRVIALAALWFCLIGLIC 
QALCSEDGHKRR I LTLGLGFLVI P FLPASNLFFRVGFWAERVL 
YLPSVGYCVLLTFGFQALS KHTKKKK1.I AAWLG ILFINTLRCV 
LRSGEWRSEEQLFRSALSVCPliNAKVHYNIGKNLADKGNQTAAI 
RYYREAVRLNPKY VHAMNNLGNILKERNELQEAEELIiS LAVQIQ 
PDFAAAWMNLGIVQNSLKRFEAAEQSYRTAIKHRRKYPDCYYNL 
GRL YAD LNRH VDALNAWRNATVLKPEKS LAWNNM 1 1 LLDNTGNL 
AQAEAVGREALELI PNDHSLMFSLANVLGKSQKYKESEALFLKA 
IKANPNAAS YHGNLAVL YHRWGHLDLAKKH YEI SLQLDPTASGT 
ICE N YGLLRRKLELMQ KKAV 


6367 


287 


1934 


S I GFP VML VLS I LL YTC EM FQDSVAFEDVAVS F TQ EE WALLDPS 
QKNLYRDVMQETFXNLTSVGKTWKVQNIEDEYKNPRRNLSLMRE 
KLCESKESHHCGESFNQIADDMLNRKTLPGITPCESSVC3GEVGT 
IHlKALrlAjnKboc jOaiUENFiRNKECKKAFSYLDSFQ 
SHDKACTKEKPYDGKECTETFISHSCIQRHRVMHSGDGPYKCKF 
CGKAFYFLNLCLIHERIHTGVKPYKCKQCGXAFTRSTTLPVHER 
THOX^ADECKECXSNAFSFPSEIRRHKRSHTGEKPYECKQCGKV 
FISFSS IQYHKMTHTGEKPYECKQCGKAFRCGSHLQKHGRTHTG 
EKPYECRQCGKAFRCTSDLQRHBKTKTEDKPYGCKQCGKGFRCA 
SQLQIHERTHSGEKPHECKECGKVFKYFSSIiRIHBRTHTGEKPH 
ECKQCGKAFRYFSS LHIHERTHTGDKPYECKVOG KAFTCSSS IR 
YHERTHTGEKPYECKHCGKAFISNYIRYHERTHTGEKPYQCKQC 
GKAFIRASSCREHERTHTINR 


6368 


1 


327 


RPVPAKLNPRSWPRTAGALPLRP PPLTMAVFHDEVE I EDFQYDE 
DSETYFYPCPOGDNFSITKEDLENGEDVATCPSCSLIIKVIYDK 
DQFVCGETVPAPSANKELVKC 


6369 


1 


1745 


AGCCRDTRFPTPRGPGSLCHNFCRSAACTVl'RTIHGSPREDTGT 
PRSREMMFQDSVAFEDVAVSFTQEEWALLDPSQKNLYRDVMQET 
FKNLTSVGKTWKVQNIBDEYKNPRRNLSLMREKLCESKESHHOG 
ESFNQIADDMLNRKTLPGITPCESSVCGEVGTGHSSLNTHIRAD 
TGHKSSEYQEYGENPYRNKECKKAFSYLDSFQSHDKAC7KEKPY 
DGKECTBTFISHSCIQRHRVMHSGDGPYKCKFCGKAFYFLNLCL 
IHERIHTGVKP YKCKQ CGKAFTRSTTLP VMERTHTGVNADE CKE 
CGNAFSFPSEIRRHKRSHTGEKPYECKQCGKVFISFSSIQYHKM 
THTGEK P YE C KQCG KAFRCGSHLQKHGRTHTG EKP YE CRQ CGKA 
FRCTSDLQRHEKTHTEDKPYGCKQCGKGFRCASQLQIHERTHSG 
EKPHECKECGKVFKYFSSLRIHERTHTGEKPHECKQCGKAFRYF 








S S LH I HERTHTGDK P YE CKVCGKAFTCS SSI RYHISrTHTGEKP Y 

ECKHCX3KAFISNYIRYHERTHTGEKPYQCKQCGKAFIRASSCRE 
HERTHTINR \ 


6370 


1711 


329 


FVLSEQRLRTERTWPRSPGLGRGAAAAGARTAGAGLLRLIjLGCG 
AL VGGLR P VTMTTPANAQNAS KTWELSL YELHRTPQEAI MDGTE 
IAVSPR5LHSEI^CPICLDXLKirrMTTKBCLHRFCSDCIVTALR 
S GNKECPTCRKKL VS KRSLR PDPNFDALIS K I YPSR EE YEAHQD 

RVLIRLSRLHNQQALSSS ieeglrmqamhraqrvrrpipgsdqt 

TTMSGGEGE PGEGEGDGEDVSSDSAPDSAPG PAPKRPRGGGAGG 
S S VGTGGGGTGGVGGGAGS EDSGDRGGTLGGGTLGP PS P PGAPS 
P PE PGGE I ELVFR PHPLLVEKGEYCQTR YVKTTGNATVDHLSK Y 
! LALRIALERRQQQBAGEPGGPGGGASDTGGPDGCX3GEGGGAGGG 
DGPEEPALPSLEGVSEKQYTIYIAPGGGAFTTLNGSLTLELVNE 
KFWKVSRPLELCYAPTKDPK 


6371 


3 


288 


GVANMSTAMNFGTKSFQPRPPDKGSFPLDKLGECKSPKEKFMKC 

LHNlWFENALCRKESKEYLECRMERiCLMLQEPLEKLGFGDLTSG 

KSEAXK 


6372 


2141 


62S 


RVSAIASEGKAEERYKKLEDLLEKS FSLVKMPSLQPWMCVMKH 
LPKVPEKKLKLVMADKELYRACAVEVRRQIWQDNQALFGDEVSP 
LLKQYILEKESALFSTELSVLHNFFS PSPKTRRQGEWQRLTRM 
VGKNVKIjYI3MVliQFLRTLFIiRTRNVHYCTLRAELLMSLHDLDVG 
E I CTVDPCHKFTWCLDAC IRERFVDS KRARELQG FLDG VKKGQE 
QVLGDLSMILCDPFAINTLALSTVRHLQELVGQETLPRDSPDIiL 
LLLRLLALGQGAWDKIDSQVFKEPKMEVELI TRFLPMLMS FLVD 
DYTFNVDQKLPAEEKAPVS YPNTLPES FTKFLQEQRMACEVGLY 
YVLHITKQPJnCNALLRLLPGLVETFGDIiAFGDI FLHLLTGNLAL 
LADEFALEDFCS SLFDGFFLTASPRKENVHRHALRLIiIHLHPRV 
AP S KLEALQKALEPTGQSGEAVKELYSQLGEKLEQLDHRKPS PA 
QAAETPALBLPLPSVPAPAPL 


6373 


67 


711 


PSRAARASPARLPAMVSWIISRLWLlFGT^VPAYYSYKAVKSK 
DI KE YVKWMM YW 1 1 FAL FTTAET FTD IFLCW FPFYYELKIAFVA 
WLLS P YTKGSSliLYRKFVHPTLSSKEKE IDDCLVQAKDRS YDAL 
VH FG KRGLNVAATAA VMAAS KGQGALS ERLRS F SMQDLTT I RGD 
GAPAPSGPPPPGSGRASGKHGQPKMSRSASESASSSGTA 


6374 


S35 


2105 


HKLFCS YISTS EFP SSTRHHSCPTHTFCN YTSST I FLSSTRDHS 
CPTHTFCNYTSSTI FLSSTRDHSCPTHTSCNYTSSTI FLSSTRD 
HSCPTHTSCNYTSS TI FLS S TRDHS CPTHTFCNYPRP I IRLS SC 
CPAELQTEGSNGKKBVLSGFQWLEDTVLFPBGGGQPDDRGTIN 
DISVLRVTRRGEQADHFTQTPLDPGSQVIiVRVDWERRFDHMQQH 
S GQHL I TAVADH LFKLKT TS WE LGRFRS AX E LDT P S MTAEQ VAA 
IEQSVNEKIRDRLPVNVREIiSLDDPEVEQVSGRGLPDDHAGPIR 
WNIEGVDSNMCCGTHVSNLSDLQVI KILGTEKGKKNRTNLI FL 
SGNRVLKWMERSHGTEKALTALLKOGAEDHVEAVKK I iQNSTKI L 
QKNNLNIiLRELAVHIAHSLRNSPDWGGVVILHRKEGDSEFMNI I 
ANEIGSE3TLLFLTVGDEKGGGLFLLAGPPASVETLGPRVAEVL 
EGKGAGKKGRFQGKATKMSRRMEAQALLQDYISTQSAKS 


6375 


1 


1535 


AIMAAATRPVRLPEAGCEGRERCWNPSRSRSHSGEGGIAAWSRT 
CPGR PRRPGQQ WRG PTMLVTAYLAFVGLLASCLGLELS RCRAK 
PPGRACSNPSFI^FQI^FYQVYFLALAADWLQAPYLYKLYQHYY 
FLEGQIAILYVCXJLASTVLFGLVASSLVDWUSRKKSCVIjFSLTY 
SLCCLTKLSQDYFVLLVGRALGGLSTALLFSAFEAWYIHEHVER 
HDFPAEWIPATFARAAFWNHVLAVVAGVAAEAVASWIGLGPVAP 
FVAAI PLLAIAGALAIiRNWGENYDRQRAFSRTCAGGLRCLLSDR 
RVLLI^TXQALFESVIFIFVFLWTPVIiDPHGAPLGIIFSSFMAA 
SLU3SSLYRIATSKRYHLQPMHLLSLAVLIWFSLFMIiTFSTSP 
GQESPVESFIAFLLIEIACGLYFPSMSFLRRKVIPBTEQAGVLN 








WFRVPLHSLACLGLLVLHDSDRKTGTRNMFSICSAVMVMALIAV 
VGLFTWRHDAELRVPSPTEEPYAPEL 


6376 


380 


1437 


itobiUJ.iJMYRFSFLWSKMPSK£SWSGRKTNftAAVHKSKQEGRQ 

QDLLI AALGMKLGS pkss vtiwqplklfaysqlts lvrratlkb 

NEQIPKYEKIHNFKVHTFRGPHWCBYCANFMWGL1AQGVKCADC 
GLNVHKQ CS KMVPNDCKPDLKHVKKVYS CDLTTLVKAHTTKRPM 
WDMC I RE I ESRGLNSEGLYRVSGFS DLIEDVKMAFDRDGEKAD 
ISVNMYEDINIITGALKLYFRDLPIPLITYDAYPKPIESAKIMD 
PDEQLETLHEALKLLPPAHCETLRYLMAHLKRVTLHEKENLMNA 
ENLGI VFGPTLMRS PELDAMAALNDIRYQRLWELLI KNEDILF 


6377 


2311 


1844 


SRIRRRSSRRPREPPGPSRRRRRRRPDPRTMPSEKTFKQRRTFE 
QR VEDVRL I REQHPTKI PVI I ERYKGEKQLPVLDKTKFLVPDHV 
NMSEL IKI I RRRLQLNANQAFFLLVNGHSMVSVSTP ISEVYESE 
KDEDGFLYMVYASQETFGMKLSV 


6378 


686 


191 


GAGPWEAFPDGIGRRSRRARbPQYKRPpGRVGGGDSGRRNMAVA 
DLAIjIPDVDIDSDGVFKYVLIRVHSAPRSGAPAAESKEIVRGYK 
WAS YHAD X YDKVSGDMQKQGCDCE CLGGGR I SHQSQDKKI HV YG 
YSMAYGPAQHAI STE K I KAKYPD YE VTWANDGY 


6379 


35 


378 


eragspspsraalrrcapqrsqaprwpdraacrrsfqgsqgray 

L FNS WNVG CG P AE E RVLLTGLHAVAD I YCENC KTTLG W KYE HA 
FESSQKYKEGKYI IEIiAHMI KDNGWD 


6380 


1414 


462 


PAVQGQRGAGPP'iX»RGSGNMARS 1 AljTVVRHGETkENKEKI IQGQ 
GVDEPLSETGFKQAAAAGIFLNHVKFTHAFSSDLMRTKQTMHGI 
LERS KFCKDMT VKYDS RLRERK YGWEGKAL S ELRAMAKAAREB 
CPVFTPPGGETLDQVKMRGIDFFEFLCQLILKEADQKEQFSQGS 
PSNCLETSLAE I FPLGKNHSSKVNSDSGI PGLAAS VLWSHGAY 
MRSLFDY FLTDLKCSLPATLSRSELMS VTPNTGMSLFI INFEEG 
REVKPTVQC I CMNLQDHLNGLTENS LGLNLPS KSNHFEPLKGVP 
LAhrTS LliC 


6381 


1668 


218 


AWRAQGSRGFSGAeWRPRQAAAMNFSEVFKtSSLLCKFSPDGK 
VIiASCVQYRLVVRDVNTLQILQLYTCLDQIQHI EWS ADSLFILC 
AM YKRGLVQVWSLEQPEWHCKIDEGS AGLVASCWS PDGRH I LNT 
rEFHLRITVMSLCTKSVSYIKYPKACLQGITFTRDGRYMALAER 
RDCKDYVSIFVCSDWQLLRHFDTDTQDLTGIBWAPNGCVLAVWD 
TCLB YKI LLYS LDGRLLSTYSAYE WS LGI KS VAWS PSSQFLAVG 
S YDGKVR ILNHVTWKMITEFGHPAAIND PKI VVYKEAEKS PQLG 
LGCItSFPPPRAGAGPLPSSESKYEIASVPVSIiQTLKPVTDRAMP 
* «n Ln\r& fyb x SiAlKNDNI PNAVWVWDIQKLRLFAVIiEQL 
S P VRAFQWDPQQ PRLAICTGGSRLYLWSPAGCMS VQVPGEGDFA 
VLSLCWHLSGDSMALLSKDHFCt,CFLETEAWGTACRQLGGHT 


6382 


2 


1062 


FEEDEDRNWTLIAYPLKGDHGIVDIVDEISDCEPKSKliIiRWTTNK " 
KHHVLETEKTPKDWVRQHRKEEKMKSHKLEEEFEWLKKSEVLYY 
TVEKKGN ISSQLKHYNPWSMKCHQQQLQRMKENAKHRNQ YKF I L 
LENLTS RYEVPCVLDLKMGTRQHGDDASEEKAANQ IRKCQQSTS 
AVIGVRVCGMQVYQAGSGQLMFMNKYHGRKLSVQGFKEALFQFF 
HNGR YLRRELLG P VLKKLTELKAVLBRQES YRFYSSSLLVI YDG 
KERPEWLDSDAE DLEDLSEESADES AGAYAYKP I GAS S VD VRM 
I DFAHTTCRL YGEDTWHEGQDAG YI FGLQSLID I VTE ISEE SG 
E 


6383 


3159 


1061 


S PAPGRPS PHGSQPAARAAAAPAMPSAKQRGSKGGHGAASPSEK 
GAHPS AARPLAAPTPAAPACRS PS PGGAPAS FPGRAPRS LAS Q P 
AARAAAAPAMPSAKQRGSKGGHGAASPSEKGAHPSGGADDVAKK 
PPPAPQQPPPPPAPHPQQHPQQHPQNQAHGKGGHRGGGGGGGKS 
SS SSSASAAAAAAAASSS ASCSRRLGRALNFLFYLALVAAAAFS 
GWCVHHVLEEVQQVRRSHQDFSRQREELGQGLQGVEQKVQSLQA 
TFGTFESILRSSQHKQDLTE KAVKQGESEVSR ISEVLQKLQNEI 








LKDLSDGI HWKDARSRDFTSLENTVEERLTKLTkS tNDNI AI P 
TEVQKRSQKEINDMKAKVASLEESEGNKQDLKALKEAVKEIQTS 
AKSRE WDMEALRS TLQTMESD I YTBVREL VS LXQBQQAFKEAAD 
TERLALQALTEKLLRSEESVSRLPEEIRRLEEELRQLKSDSHGP 
KEDGGFRHSEAFEALQQKSQGLDSRLQHVEDGVLSMQVASARQT 
ESLESIJjSKSQEHEQRLAALQGRLEGLGSSEADQDGIjASTVRSL 
GETQLVLYGDVEEIjKRSVGEIjPSTVESLQKVQEQVHTLLSQDQA 
QAARLPPQDFLDRLS S LDNLKAS VSQ VEADLKMLRTAVDSLVAY 
SVKIETNENNI/ESAKGLLDDLRNDLDRLFVKVEKIHEKV 


6384 
6385 


738 


1904 


iwevpvclthllhlqqanqplpppsssineedadeanraigekr 

AAPDSG KKP KTPKTKQQKDPNE PQKPVS AYALFFRDTQAAI KGQ 
NPNATFGEVSQIVASMWDSLGEEQKQVYKRKTEAAKKEYLKALA 
AYRASLVS KAAAESAEAQTIRS VQQTLAS TNLTS SLLLNTPLSQ 
HGTVSASPQTLO^SLPRSIAPKPLTMRLPMNQIVTSVTIAANMP 
SN1 GAPLISS^TTMVGSAPSTQVSPSVQTQQHQMQLQQQQQQQ 
Q0QMQQMQQQQIX3QHQMHQQIQ(^MQQQHFQHHMQQHLQQQQQH 

lqqqinqqqlqqqlqqrlqlqqlqhmqhqsqpsprqhspvasqi 
tspipaigspqpasqqhqsqiqsqtqtqvlsqvsif 




2 


1584 


PRVRA7U3VAAGAQAWSAQMAKSNGENGPRAPAAGESLSGTRES 

laqgpdaattdelsslgsdseangfaerridkfgfivgsogaeg 
aleevplbviirqreskwldmlnnwdkwmakkhkkiriircqkcil p 
PSLRGRAWQ ylsggkvxlqqnpgkfdeldmspgdp kwldvi erd 
lhrqfpfhemfvsrgghgqqdlfrvlkaytlyrpebgycqaqap 
iaaviilmhmpaeqafwclvqicekylpgyysekleaiqldgeil 

FSLLQKVSPVAHKHLSRQKIDPLLYMTEWFMCAFSRTLPWSSVL 
RWDMFFCEGVKIIFRVGLVLLKHAI^SPEKVKACOGQYETIER 
LRSLS PKI MQEAFLVQEWELPVTERQI EREHLIQLRRWQETRG 
ELQCRSPPRLHGAKAILDAEPGPRPALQPSPS I RLPXjDAPLPGS 
KAKPKPPKQAQKEQRKQMKGRGQLEKPPAPNQAMWAAAGDACP 
PQHVP PKDSAPKDSAPQDLAPQVSAHHRSQESLTSQESEDTYL 


6386 


819 


195 


' T VCGS F YLG IMQRASRLKREiiriMliATEPPPGI TCWQPKDQMDDL 
RAQ I LGGANTP YEKGVFKLEVI IPERYPFEPPQIRFLTPIYHPN 
IDSAGRICLDVLKLPPKGAWRPSLNIATVLTSIQLLMSEPNPDD 
PLMADISSEFKYNKPAFLKNARQWTEKHARQKQKADEEEMLDNL 
PBAGDS RVHNS TQKRKASQLVGI BKKFHPDV 


6387 


1 


662 


PG P THAS ADA WADA WAQ PNMAMHNKAAP PQ I PDTRRELAELVKR 
KQELAETLANLERQrYAFEGS YLEDTQMYGNI IRGWDRYLTNQK 
NSNS K3TORRNR KFKEAERL FSKS S VTS AAAVSALAG VQDQLI EK 
REPGSGTESDTSPDFHNQENEPSQEDPEDLDGSVQGVKPQKAAS 
STSSGSHHSSHKKRKNKNRHSPSGMFDYDFEIDLKLNKKPRADY 


6388 


1 


662 


PGPTHASADAWADAWAQPNMAMHNKAAPPQI PDTRRELAELVKR 
KU-ttu/us l IjANIiERQIYAFEGS YLEDTQMYGNI IRGWDRYLTNQK 
NSNSKNDRRNRKFKEAERLFSKSSVTSAAAVSALAGVQDQLIEK 
REPGSGTESDTSPDFHNQENEPSQEDPEDLDGSVQGVKPQKAAS 
STSSGSHHSSHKKRKNKNRHSPSGMFDYDFEIDLKLNKKPRADY 


6389 


1074 


497 


akpgurmaghrlvlvlgdLhiphrcnslpaxfkkllvpgkiqhi 

LCIXINLCTKESYDYLKTLAGDVHIWGDFDENLNYPEQKVVTVG 
QFKIGLIHGHQVIPWGDMASLALIiQRQFDVDILISGHTHKFEAF 

ehenkfyinpgsatgaynaletni ipsfvlmdiqastwtyvyq 

LIGDDVKVERIEYKKP 


6390 


158 


535 


GEERKEGRAPGKAFAPERNPAKMEKEETTRELtitPNWQGSGSHG 
LTIAQRDDGVFVQEVTQNSPAARTGWKEGDQIVGATIYFDNIiQ 

sgevtqllntmghhtvglklhrkgdrffpslgqtwdp 


6391 


5386 


2897 


VRWNSKTECYLSiQTQENFPANLNEIiVNCIVXSSIiVTTQRKLkA 

msllgsrnqlaravlnpnpmdfctkdlltttseri iaylrdfne 
dqkkaietayamvkhspsvakiclihgppgtgksktivgllyrl 


6392 






LTeNQRKGHSDBNSNAKIKQNkVLVCAPSNAAVDELMKKIILEF 
KEKCKDKKNPLGNCGDINLVRLGPBKSINSEVLKFSU)SQVNHR 
MKKELPSHVQAMHKRKEFLDYQLDELSRQRALCRGGREIQRQEL 
DEN IS KVS KERQELAS KI KEVQG RPQKTQS 1 1 1 LESH I ICCTLS 
TSGGLLLESAFRGOGGVPPSCVrvnp^r'rkcr'cTorpT «*n T Timn« 

KLILVGDPKQLPPTVlSMKAQEYGYDQSMMARFCRIiLEENVEHN 
M I SRLP I LQLTVQYRMHPD I CL PPSN YVYNRN tiKTNRQTEAI RC 
SSDWPPQPYLVFDVGDGSERRDNDSYINVQEIKLVMEIIKLIKD 
KRKDVS FRNIGI ITH YKAQKTM IQKDLDKBFDRKGPAEVDTVDA 
FQGRQKDCVIVTCVRANSIQGSIGPIoASLQRIiNVTITRAKYSLP 

I LGHLRTTiMF WQHWWOT i TOHB O KP A T T WnrunmrTuvn-mrrfr-r w 1 

LKPVLQRSLTHPPTIAPEGSRPQGGLPSSKLDSGPAKTSVAASL 
YHTPSDSKEITLTVTSKDPERPPVHDQLQDPRLLKRMGIEVKGG 
I FLWDPQPSSPQHPGATPPTGEPGFPWHQDLSHVQQPAAWAA 
LS SHKPPVRGE P PAAS PEASTCQS KCDDP EEELCHRREARAFSE 
GEQEKCOSETHHTRRNSRWDKRTLEQEDSSSKKRKLL 


6393 


972 


J 186 


grtgvdlassmahrlqirlltwdvkdtllrlrhplgeayXtkar 

AHGLEVEPSALEQGFRQAYRAQSHSPPNYGLSHGLTSRQWWLDV 
VLO/rFHbAGVQDAQAVAPIAEQLYKDPSHPCTWQVLDGAEDTLR 
E CR TRGLRLA VI SNFDRRLEG I LGOhGLRBHFDFVLTS EAAGWP 

KPDPRIFOEAI.STAWMPIJVTTlMltrtrrirwbTVfT sirw/v/N'rvr* 

M^rftAryaftwuirtnniii 1 v VAAn VGDNYLCDYQGPRAVGMHSFL 
WGPQALDPWRDSVPKEHILPSLAHLIiPALDCLEGSTPGL 




2017 


730 


TC5GS KMAAYATCGS VAASTGSAVATAS KSNVTS FQRRGPRASVtH 
ND SGPRLVS IAGTRPS VRNGQLIiVS TGLPALDQLLGGGIjAVGT V 
LLIEEDKYNIYSPLLFKYFLAEGIVNGHTLLVASAKEDPANILQ 
ELPAPLLDDKCKKEFDEDVYNHKTPESNIKMKIAWRYQLLPKME 
IGP VSSSRFGH YYDASKRMPQEL I EASNWHG FFLPBKI S STLKV 

* r>J - 1 AXVl ^W'As^J.xxisisurDQSNPQKKQRNILRIGIQN 1 
LGSPLWGDDICCAENGGNSHSLTKFLYVLRGLLRTSLSACI ITM 
PTHLIQNKAIIARVTTLSDVWGLESFIGSERETNPLYKDYHGI, 
IHIRQIPRLNNLICDESDVKDIAFKLKRKLFTIERLHLPPDLSD 
TVSR3 S KI^LAESAKRLGPGCGMMAGGKIQILDF ! 


6394 
6395 


1418 


511 


GAAAGGl^ARRRPAAMAl 1 VMAA^AXfeRA^/T'.'ftj Ppoj r t tmprmw 
VLKQLQDI LKEASLRFTIiPGSGTEGPAKQENFI LGSCGTDQVKG 
VLT LQGDALSQ ADVNL KM PRNNQLLH FAFREDXQ WKLQQ I QDAR 
NHVSQAI YLLTSRDQS YQ PKTGaBVLKIjMDAVMLQI»TRARNRLT 
TPATLTLPEIAASGLTRMFAPALPSDIjLVNVYINLNKLCLTVYQ 

lhalqpnstknfrpaggavlhspgamfewgsqrlevshvhkvec 
vipwlndalvyftvslqlcqqlkdkisvfssywsyrpf 1 




~ i5 r 


658 


PSGRPTRPLCCAARRGAA^(jGSVSGWPaGRTP1^ETSNPGSSVM 1 

esvtfedvavefiqewalldsarrslckyrmldqcrtlasrgtp 
pckpscvsqlgqraepkatergilratgvawesqlkpeelpsmq 

DLLEEASSRDMQMGPGLFLRMQLVPSIEERETPLTREDppat ot? 

ppwslgctglkaamqiqrwi pvptlghrnpwvardsge 


6396 
6397 


1 


1221 


AN ILS S PS KRGO KGTJj IG YS PEGTPLYNFMGDAFQ HS S QS 1 PRF 
I KESLKQILEESDSRQI FYFLCLNLLFTFVEIiFYGVLTNSLGLl 
SDGFHP1LFDCSALVMGLFAALMSRWKATRIFSYGYGRIEILSGF 
INGLFLIVIAFFVFMESVARLIDPPELDTHMLTPVSVGGLIVNL 
IGI CAFSHAHSHAHGASQGS CHS SDHS H S HHMHGHS DHGHGHSH 
GSAGGGMNANMRGVFLHVLADTIiGSIGVIVSTVLIEQFGWFIAD 
PLCSLFIAILIFLSWPLIKDACQVU.LRLPPEYEKELHIALEK 
IQKIEGLISYRDPHFWRHSASIVAGTIHIQVTSDVLEQRIVQQV 
TG ILKDAG VNNLTIQVE KEAYFQHMSGLS TGFHDVLAMTKQMES 
WKYCKDGTYIM 




391 


122 


3AGGVGR FEAl RAPARMI E WCWDRLGKKVRVKCNTDDTIGDLK 1 
KLIAAQTGTRWNKIVLKKWYTIFKDHVSLGDYEIHDGMNLELYY 


6398 


353 


1306 


HKQMGPLINkC!OCILIiPTTVPPATMRIWLIjGGLLPFI>LL£SgLQ 
RPTEGSEVAIKIDPDFAPGSFDDQYQGCSKQVMEKLTQGDYFTK 
D 1 EAQKNYFRMWQKAHIAWLNQGKVLPQNMTTTHAVAI I/FYTIJT 
SNVH SDFTRAMAS VARTPQQ YERS PHFKYLHY YLTS A I QLLRKD 
S I MENGTL C YE VHYRTKDVHFNAYTGATIR FGQ FLSTS LLKEEA 
QEFGNQTLFTI FTCLGAPVOYF^T.K"Tf pvi\tdt>vt?t cvtfTi>iMovn 

PRGDWLQLRSTGNIiSTYNCQLLKASSKKClPDPIAIASLSFLTS 
VIIFSKSRV 


6399 


75 


1245 


PNIiETYFGRRCEKDSMNFTPTHTPVCRKRIWSKRGVAVSGPTK 
RRGMADSLESTPLPSPEDRLAKLHPSKELLEYYQKKMAECEAEN 
EDLLKKLELYKEACEGQHKLBCDLQQREEEIAELQKAIiSDMQVC 
LFQEREH VLRL YSENDRLRI RELEDKKK IQNLLALVGTDAGE VT 
YFCKEPPHKVTILQKTIQAVGECEQSESSAFKADPKISKRRPSR 
ERKESSEHYQRDIQTLILQVEALQAQLGEQTKLSREQIEGLIED 
RR IHLEE IQVQHQRNQNKI KELTKNLHHTQELLYES TKDFLQLR 
SENQNK^KSWMLEKI)NLMSKIKQYRVQCKKKEDKIGKVLPVMHE 
SHHAQSBYI KVMSLCRNEWYFSGRVEGIPKNLQFVM 


6400 


2520 


1053 


DNIS VTFLS LTDLQ KNETLDIILI SLSGAVQIiRHLSNNLETLLKR 
DFLKLLPLELSFYLLKWLDPQTLLTCCLVSKQWNKVISACTEVW 
QTACKNLGWQIDDSVQDALHWKKVYLKAILRMKQLEDHEAFETS 
S LIGHS ARVYALY YKDGLLCTGS DDLSAKLWDVSTGQC VYG I Q/T 
HTCAAVKFDEOKLVTY3J5 FTINTVArWEWQQr* ADnv-ttjetTj^tTrrvn inre» 

SVDYNDELDILVSGSADFTVKVWALSAGTCLNTIiTGHTEWVTKV 
VLQKCKVKSLLHSPGDYrLLSADKYEIKIWPIGREINCKCLKTL 
S VSEDRS I CLQPRLHFDGKYI VCSSALGLYQWDFAS YDILRVIK 
TPE IAHLALLGFGD I FAIjLFDNRYLY IMDLRTBSL I SRWPLPE Y 
RKS KRGSSFLAGEAS WLNGLDGHNDTGIiVFATSMPDHS IHLVIiW 


6401 


109 


766 


PGAAWSRPDLRGCCTGPQPALRMLVL PS P CPftPLAFS S VETMEG 
PPRRTCRSPEPGPSSSIGSPOASSPPRPNHYLLIDTCGVPYTVI* 
VDEESQREPGASGAPGQKKCYSCPVCSRVFEYMSYLQRHSITHS 
EVKPFECDI CGKAFKRASHLARHHS IHLAGGGRPHGCPLCPRRF 
RDAGELAQHSRVHSGERPFQCPHCPRRFMEQNTLQKHTRWKHP 


6402 


T195 


279 


TTSQCGGI RQSSAI PVAStffiFAAICLRNALLLLPEEOXJDPKQEN 
GAKNS NQLGGNTES SESS ETCSS KSHDGDKF I PAPPSS PLRKQE 
LENLKCS I IACSAYVALAIjGDNLMALNHADKLLQQPKLSGSLKF 
LGHL YAAEALIS LDR I S DAI THLNPENVTD VS LGIS SNEQDQGS 
DKGENEAMESSGKRAPQCYPSSVNSARTVNLFNLGSAYCLRSEY 
DKARKCLHQAASMIHPKEVPPEAILLAVYLELQNGNTQLALQI I 
KRNQLLPAVKTHSEVRKKPVPQPVHPIQPIQMPAFTTVQRK 


6403 
6404 


2 

1012 


1690 

22} ■ - 


rgihtsvu^nlqnqmyshnvvimnlnnlnltqvqqrnlitnlq 
rsvddtsqaiqrikndfqnlqqvfloakkdtdwlkekvqslqtl 
aann salakanndtledmnsqlns ftgqmeni ttisqaneqnlk 

DLQDLHKDAENRTAIKFNQLEERFQLFETD I VNI ISNIS YTAHH 
LRTLTSNLNEVRTTCTDTLTKHTDDLTSLNNTJLANIRLDSVSLR 
MQQDLMRSRLDTEVANLSVIMEEMKLVDSKHGQLIKNFTILQGP 
PGPRGPRGDRGSQGPPGPTGNKGQKGEKGEPQPPGPAGBRGPIG 
PAGPPGERGGKGS KGSQG PKGSRGS PGKPGPQGPSGDPGP PG PP 
GKEGLPGPQGP PGFQGLQGTVGEPGVPG PRGLPGLPGVPGMPGP 
KGPPGPPGPSGAWPIiALQNEPTPAPEDNSCPPHWKNFTDKCYY 
FS VE KEI FEDAKLFCEDKS SHLVFINTREEQQWI KKQMVGRESH 
W IGLTDSERENE WKWLDGTS PD YKNWKAGQPDNWGHGHGPGED C 
AGL I YAGQWNDFQCEDVNNFICEKDRETVLS SAL 
kAALAMAAPAPGLlSVFSSSQEIX3AAIiAQLVAQRA^ j 








RFALGLSGGSLVSMLARBLPAAVAPAGPASLARWTIiGFCDERE^" 
P FDHAEST YGLYRTHLLSRLP I PE SQ V ITIN PELPVE EAAED YA 
KKLRQAFQ GDS I P VFDLLI LG VGPDGHTCS L FPDHP LLQERE K I 
VAPIS DS PKP PPQRVTLTLPVLNAARTVI F VATGBG KAAVLKR I 
LEDQEBNPLPAALVQPHTGKLCWFLDEAAARLLTVPFEKHSPL 


6405 


1 


1456 


AALPRPTPRAPLGREGTGSDSEMAASMFYGRLVAVATLRNHRPR 
TAQRAAAQVLGSSGLFNNHGLOVQQQQQRNLSLHEYMSMELLQE 
AGVS VPKG YVAKS PDEAYAIAKKLGSKD WT KAQVLAGGRG KGT 
FESGLKGGVKIVFSPEEAKAVSSQMIGKKLPTKQTGEKGRICNQ 
VLVCERiCYPRREYY PAITMERS FQGPVLIGS SHGGVN I EDVAAE 
TPEAIIKEPIDIEEGIKKEQALQLAQKMGFPPNIVESAAENKVK 
LYSLFLKYDATMIEINPMVEDSDGAVLCMDAKINFDSNSAYRQK 
KIFDLQDWTQEDERDKDAAKANLNYIGLDGNIGCLVNGAGLAMA 
TMDIIKLHGGTPANFLDVGGGATVHQVTEAFKLITSDKKVLA1L 
VNIFGGIMRCDVIAQGIVMAVKDLEIKI PVWRLQGTRVDDAKA 
L IADSGLKI IiACDDLDEAARMW KLSE I VTLAKQAHVD VKFQLP 


6406 


1036 


167 


HPRQMRGEDTPEAPP YSSGR YDS I XTEVSGCPEDLTVGRAPTAD 
DDDDDHDDHEDNDKMNDSEGMDPERLKAFNMFVRLFVDENLDRM 
VPISKQPKJSKIQAIIESCSRQFPEFQERARKRIRTYLKSCRRMK 
KNGMEMTR PTP PHIiTSAMAENT LAAACESETRKAAKRMRLE I YQ 
SSQDEPIALDKQHSRDSAAITHSTYSLPASSYSQDPVYANGGLN 
YSYRGYGALSSNLQPPASLQTGNHSNGESGEARALASRPAPSWV 


6407 


492 


150 


VGL(^VSQTVl^JaDALLVFP(30VA0LSCrLSPQHWIRDYGV 
S W YQQRAGSAPR YLLYYRSEEDHHRPADI PDRFS AAKDEAHNAC 
VLTI S PVQPEDDADYYC3VGYGFS P 


6408 


1458 


903 


1 1 aay/iWKiit\iQiVTRGFNMKI EKCYFCSGPI YPGHGMMFVR 
NDCKVFRFCKSKCHKNFKKKRNPRKVRWTKAFRKAAGKELTVDN 
S FB FEKRRNEPI KYQRELWNKT I DAMKRVEB I KQKRQAK FIMNR 

LKiCNKELQKVQDIKEVKQNIHLIRAPLAGKGKQLEEKMVQQLQE 
DVDMEDAP 


6409 


1515 


446 


NTALANLLRCFTCDR LCGGCTAPAPPAHQGI VLQ P VM PS CDPGP 
GPACLPTKTFRS YLPRCHRTYSCVHCRAHLAKHDELI SKS FQGS 
HGRAYLFNSV 


6410 


85 


607 


RGGTAGCVACLGCWGQSSSPKAAFPAGSACLPADSCPCLLFQAC 
AISGLFNCITIHPLNIAAGVWMIMNAFILLLCEAPFCCQFIEFA 
NTVAEKVDRLRS WQKAVF YCGMAWP I VI SLTLTTLLGNAI AFA 
TGVLYGLSALGKKGDAISYARIQQQRQQADEEKXAETLEGEL 


6411 


302 


772 


RLSIMASSLNEDPEGSRITYVKGDLFACPKTDSLAHCISEDCRM 
GAG IAVLFKKKFGGVQELIiNQQKKSGEVAVLKRDGRYI YYLI TK 
KRAS HK PTYENLQKS LEAMKSHCLKNGVTDLSMPRIGCGLDRLQ 
WENVSAM I EEVFEATD I K I T VYTL 


6412 


61 


1709 


RPVTSFSPLPGSCGGRLGTRTMLGRSLREVSAALKQGQITPTEL 
CQKCLSLIKKTKFLNAYITVSEEVALKQAEESEKRYKNGQSLGD 
LDGI P I AVKDNFSTSGIETTCASNMLKGYIPPYNATWQKLLDQ 
GALLMGKTNLDEFAMGSGSTDGVFGPVKNPWSYSKQYREKRKQN 
PHS ENEDS DWL I TGGS SGGSAAAVSAFTCYAAIiGSDTGGSTRN P 
AAHCGLVGFKPS YGLVSRHGL I PLVNS MD VPGI LTRCVDDAAI V 
IX3ALAGPDPRDSTTVHEPINKPFMLPS LADVS KLCIGI PKEYLV 
PELSS E VQSLWS KAADLFESEGAKVTEVSLPHTS YS I VC YHVI*C 
TSE VASNMARFDGLQ YGHRCDIDVSTEAM YAATRREG FNDWRG 
RILSGNFFliLKENYENYFVKAQKVRRLIANDFVNAFNSGVDVLIi 
TPTTLSEAVPYLE FI KEDNRTRSAQDDIFTQAVNMAGLPAVS I p 

VALSNQGLPIGIiQFIGRAFCDQQLLTVAKWFEKQVQFPVIQLQE 
LMDDCSAVLENEKLASVSLKQ 


6413 


2 


885 


tiatr c^^wuuvtLtvtnijuu&e X PlUbNi* 1 JSKAFATMGETVMSVKI IR 
NRLTG I PAG YCFVE FADLATABKCLH KINGKPLPGATPAKRFKL 
NYATYGKQPDNS PE YS LFVGDLTPDVDDGML YE FF VKVYPSCRG 
GKWLDQTGVS KGYGFVKPTDELEQKRALTEOQGAVGLGS KPVR 
LS VAI P KAS R VKPVE YSQM YS YS YNQYYQQYQN YYAQWGYDQNT 
GSYSYSYPQYGYTQSTMQTYEEVGDDALEDPMPQLDVTEANKEF 
MEQ S EELYDALMDCHWQ PLDT VS S EX FAMM 


6414 


1 1 


538 


RGGRAALLPWRRFPCCRPKPQPARPSSRATPGPRSPGMATSIGV 
SFSVGDGVPEAEKNAGEPENTYILRPVFQQRFRPSVVKDCIHAV 
LKEELANAE YS PEEMPQLTKHLSENI KDKLKEMGFDRYKMWQV 

VIGEQRGEGVFMASRCFWDADTDNYTHDVFMNDSLFCWAAFGC 
FYY 


6415 
6416 


2 


1168 


FVRQWQS SHRRACGLGCEARAGGGEEPRGRASS VAG WVGAFRAP 
F I EAAVAGLGAGSGKRRRGWKMP VHSRGDKKETNHHDEME VDYA 
ENEG S SS EDEDTES S S VSEDGDS S2 4 MDDBDCERRRMECLDEMSN 
LEKQFTDLKDQLYKERLSQVDAKLQEVIAGKAPEYLEPLATLQE 
NMQ I RTKVAG I YRELCLES VKNKYECE I QAS RQHCES E KLLLYD 
TVQSELE HKIRRLEEDRHSIDITS ELWNDELQSRKKRKDPFWPD 
KKKPGWS GPYI VYMLQDLDI LED WTTI RKAMATLGPHRVKTE P 
PVKLEKHIiHSARSEEGRIiYYDGEWYIRGQTI CIDKKDECPTSAV 
ITTINHDE VWFKRPDGSKSKLYISQLQKGKYS I KHS 




410 


1513 


EI APADLE I PACAP VLLS RATS STMS VTGGKMAPSLTQfe thSKL 
GLAS KTAAWGTLGTLRTFLNFS VDKDAQRLLRAITGQGVDRSAI 
VDVLTNRSREQRQLI SRNFQERTQQDLMKS LQAALSGNLERI VM 
AI*LQ?TAQFDAQELRTALKASDSAVDVAIEILATRTPPQI^2ECL 
AVYKHNFQVE AVDG I TS ETSG ILQDLLLAIAKGGRDS YS GI IDY 
NLAEQDVQALQRAEGPSREETWVPVPTQRNPEHIjIRVFDQYQRS 
TGQELEEAVQNRFHGDAQVALLGLASVTKNTPLYFADKLHQALQ 
ETEPNYQVLIRILISRCETDLLSIRABFRKKFGKSLYSSLQDAV 

kgd cqs allalcrabdm 


6417 


1


845 


Ki^t^KVljWbisijJS^iiA^UAUUWASSIJa^ARMDNRFATAPVIACVLS 
LISTIYMAASIGTDFWYEYRSPVQENSSDLNKSIWDEFISDEAD 
EKTYNDALFRYNGTVGLWRRCITIPKNMHWYSPPERTESFDWT 
KCVSFTLTEQFMEKFVDPGKHNSGIDLLRTYLWRCQFLLPFVSL 
GIiMCFGAL I GLCACI CRSL YPT IATG I LHLLAGLCTLGSVS CYV 
AGIELLHQKLELPDNVSGEFGWSFCLACVSAPLQFMASALFIWA 
AHTNRKEYTLMKAYRVA 


6418 


2 


662 


TRTRPRRPPGliGAAVGKAGARSTSTPAGASPAAAYQADPPPPAH 
rArr ^^^^^vj^iALJiybPAKffYGYDNLQRQPI FTTQQEAELVQ 
YFDCKSSSGNIGEDPDHLNQSSSPSQMFPWMRPQAAPGRRRGRQ 
TYSRFQTLELEKEFIjFNPYLTRKRRIEVSHALALTERQVKIWFQ 
NRRMKWKKENNKDKFPVSRQEVKDGETKKEAQELEEDRAEGLTN 


6419


1 


973 


PGRPRVRNFDLNSKS ILQEFFCTRS IQ I PANRS#TAM£kCP I FP 
MARSISTSGPIJJKEDTGRQKLISTGSLPATLQGATDSLGLEWHr* 
PS PDP VTVP YLS PLWWKELESLLBNEGDHAI T VAD FVDHHP I V 
FWNLVWYFRRLDLPSNLPGL ILSSEHCN KYSKI PRHGMSEDS KY 
VL I QMLWDNMKLHQDPGQPLYILWNAHTQKYPMVHLLQKSDNS F 
NQE LLKSMVKS I KMNDVYG PMS Q I LETLNKCPHFKRQ RS LYRE I 
LFL S L VALGREN I D IDAFDKE YKMAYDRLTPSQVKSTHNCDRPP 
STGVMECRKTFGEPYL 


6420


207 


1187 


rkmidknqtcgvgqdsvpymiclihIleeWfgVE^Ledylnfan 

YLLWVFTPLILLILPYFTIFLLYiiTI IFLHI YKRKNVLKEAYSH 

nlwdgarktvatlwdghaavwhgyevhgmekipedgpali IFYH 

GAIPIDFYYFMAKIFIHKGRTCRWADHFVFKIPGFSLLLDVFC 

alhgprekcveilrsghllaispggvrealisdetynivwghrr 
gfaq vaidakvp 1 1 pmftqniregfrslggtrlfrwlyekfryp 
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SECT- 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








FAPMYGG F P VKLRT YLGDP I P YDPQI TAEELAEKTKNAVQaL I D 
KHQRIPGNIMSALLERFH 


6421 


1844


362 


w/uj^ijJo<y^JiKMfeWKblrfbPHPk^vviiKSBFKMASSPAVLRASRL 
YQWS LKS S AQ FLGS PQLRQVGQI IRVPARMAATLI LE PAGRCCW 
DEPVRI AVRGLAPEQPVTLRAS LRDEKGALFQAHARYRADTLGE 
LDLERAPALGGSFAGLEPMGLLWALEPEKPLVRIiVKRDVRTPLA 
VELE VLDGHDPDPG RLLCQTRHER YFLPPGVRRE P VRVGRVRGT 
LFLPPEPGPPPGIVDMFGTGGGLLEYRASLLAGKGFAVMALAYY 
NYEDLPKTMETLHLEYFEEAMNYLLSHPEVKGPGVGLLGISKGG 
ELCLSMASFLKGITAAWINGSVANVGGTLRYKGETljPPVGVNR 
NRI KVTKDGYADI VDVIiNS PLEGPDQKS FI PVERAESTFLFLVG 
QDDHNWKSEFYANBACKRLQAHGRRKPQI ICYPETGHYIEPPYF 
PLCRAS LHALVGS P 1 1 WGGE PRAHAMAQVDAWKQLQT FFHKHLG 
GREGTIPSKV 


6422 


181 


2133 


EGENliSWFQEFWGDIAKEFyWJ^TPCPGPFIiRVNFDV^KGKIFIE 
WM KGATTN I CYNVLDRNVHEKKLGDKVAFYWEGNE PGETTQI T Y 
HQLliVQVCQF SNVLRKOX31 HKGDRVAI YMPMIPELVVAMLACAR 
IGALHSIVFAGFSSESLCERILDSSCSliLITTDAFYRGEKLVNI, 
KELADEAI^KCQEKGFPVRCCIVVKHLGRAEIiGMGDSTSQSPPI 
KRS CPDVQ1 S WNQG IDLWWHELMQEAGDECE PE WCDAEDPLFIL 
YTSGSTGKPKGWHTVGGYMLYVATTFKYVFDFHAEDVFWCTAD 
IGWI TGHS YVTYGPLANGATSVLFEGI PTYPDVNRLWS I VDKYK 
VTKFYTAPTAI RbLMKFGDEPVTKHSRASLQ VLGT VGE PINPEA 
WLWYHRVVGAQRCPIVDTFWQTETGGHMLTPLPGATPMKPGSAT 
FPFFGVAPAILNESGEELEGEAEGYIfVFKQPWPGIMRTVYGNHE 
RFETTYFKKFPGYYVTGDGCQRDQDGYYWITGRIDDMIiNVSGHL 
LS TAEVESALVEHEAVAEAAVVGHPHPVKGECL YCFVTLCDGHT 
FSPKLTEELKKQIREKIGPIATPDYIQNAPGLPKTRSGKIMRRV 
LRKIAQNDHDLGDMSTVADPS VI SHLFS HRCLTIO 


6423 


614 


1237 


/WI/KEIPRDLPPETVIJiYLDSNQITSjp^IFKDLHQI^VI^- 
KNG I E F IDEHAFKGVAETLQTIiDLSDNR IQS VHKNAFNNLKARA 
J- akjm f wrtt^o I IjUU V LKSMASNHBTAHNVI CKTS VLDEHAGR P 
FLNAANDADLCNLPKKTTD YAMLVTMFGWFTMVI S YWY YVRQN 
QEDARRHLEYLKSLPSRQKKADEPDDISTW 


6424 


1 


1188 


KKVSWPVAAMVHCSCVLFRKYGNFIDKLRLFTRGGSGGMGYPRL 
GGEGGKGGDVWWAHNRMTLKQLKDRYPRKRFVAGVGANSKISA 
LKGS KG KD WEI PVP VGI SVTDENGKI I GELNKENDR I LVAQGGL 
GGKLLTNFLPLKGO!QRTTWT,nT.lifT.T&TO7r'Tirr«i7nM»i-*vr*oT T 
VSHAKPAIAD YAFTTLKPELGKIMYS DFBOQIS VADL PGL I EGAH 
MNKGMGHKFLKHIERTRQIjLFWDIS GFQLSSHTQYRTAFETI I 
LLTKELELYKEELQTKPALLAVNKMDLPDAQDKFHELMSQLQNP 
KDFLHLFEKNM I PERTVEFQHI IP ISAVTGEGIEELKNCI RKSL 

DEOANOENDAT.HIfK'nTiTiNT.WT CTYTMC e»m?T5«ae? vtrxtinvnnvunf 


6425 
6426 


1850 


1144 


IiAMEGGGGIPLETXiKEESQSRHVLPASFEVNSIKJKSNWGFLLTG 
LVGGTLVAVYAVATPFVTPALRKVCLPFVPATMKQ I ENWKMLR 
CRRGSLVDIGSGDGRIVIAAAKKGFTAVGYELNPWl,VWYSRYRA 
WREGVHGSAKFYI SDLWKVTFSQYSNWI FGVPQMMLQLEKKLE 

RELEDDARVIACRFPFPHWTPDHVTGEGIDTVWAYDASTFRGRE 
KRPCTSMHFQLP IQA 




30 


565 


SRGAAVGGMS VAGGEI RGDTGGEDTAAPGRFS FS PE PTLED I RR — 
LHAE FAAERDWEQFHQPRNLLLAI»VGE VGELAELFQWKTDGEPG 
PQGWSPRBRAALQEELSDVLIYLVALAARCRVDLPLAVLSKMDI 
NRRR YPAHLARSSSRKYTELPHGAIS EDQAVGPADI PCDS TGQT 
ST 


6427 


145 


959 


AhS WG PPH VPKAQKMVS WM I CRL VVL VFGMLCPAYAS YKA VKTK 
WIRE YVRWMMYWIVFALFMAAE I VTD I F I S W FP FYYEI KMAFVL 
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ID 
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beginning 
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amino acid 
residue of 
amino acid 
sequence 


Predicted end 

nucleotide 

location 

corresponding

to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








WLLSPYTKGASIiLYRKFVHPSLSRHEKBIDAYIVQAKBRSYETV ' 
LS FG KRGLN IAAS AAVQAATKSQGALAGRLR S FSMQDLRS ISDA 
PAPAYHDPLYLEDQVSHRRPPIGYRAGGLQDSDTEDECWSDTEA 

VPRAPARPREKPLIRSQSLRVVKRKPPVREGTSRSLKVRTRXKT 
VPSDVDS 


6428 


1982 


444 


SGSGGKMEDHQH VP IDIQTS KLLDWLVDRRHCS LKWQSLVLTI R* " " 
EK1NAAIQDMPESEE1AQLLSGSYIHYFHCLRILDLLKGTEAST 
KNI FGRYSSORMKDWQEI IALYEKDNTYLVELSSLLVRNVNYEI 
PSLKKQIAKCQQLQQBYSRKEEECQAGAAEMREQFYHSCKQYGI 
TGENVRGELLALVKDLPSQLAEIGAAAQQSLGEAIDVYQASVGP 
VCESPTEQVLPMLRPVQKRGNSTVYEWRTGTEPSWERPHLEEL 
PEQVAEDAIDWGDFGVEAVSEGTDSGISAEAAGIDWGIFPESDS 
KDPGGDGIDWGDDAVALQITVLSAGTQAPEGVARGPDALTLLEY 
TETRNQPLDELMELEI FLAQRAVELS EE ADVLS VSQFQLAPAI L 
QGQTKEKMVTMVSVLEDIi I GKLTSIjQLQHLFMl LAS PRYVDRVT 
EFLQQKIjKQSQLIiALKKELMVQKQQEALEEQAALEPKLDLLLBK 
TKELQKLI EADI SKRYSGRPVNLMGTSL 


6429


3413 


3442 


epsswtaaprgpLaahpLeAavqeddrralsfdsrikvfangtl 

WKS VTD KDAGD YLCVARNKVGDD YVVLKVDVVMKP7VKI EHKEE 
NDHK VFYGGDLKVDCVATGLPNPE X S WS LPDGSLVNSFMQSDDS 

ggrtkrywfnngtlyfnevgmreegdytcfaenqVgkdemrvr 

VKWTAPATI RNKTCLAVQ VP YGDVVTVACEAKGE PMPKVTWLS 
PTNKVIPTSSEKYQXYQDGTLLIQKAQRSDSGNYTCLVRNSAGE ■ 
DRKTVWIHVNVQPPK1NGNPNPITTVREIJVAGGSRKLIDCKAEG 

iptprvlwafpegwlpapyygnritvhgngsldirslrxsdsv 
qlvcmarneg&earlivqltvlepmekpifhdpisekitamagh 

TISLNCSAAGTPTPSLVWVLPNGTDLQSGQQLQRFYHKADGMLH 
ISGLSSVDAGAYRCVARNAAGHTERIjVSLKVGLKPEANKQYHNL 
VS I INGETLKLP CTPPGAG QGRFS WTLPNGMHLEGPQTLGR VSL 
LDNGTLTVREAS VFDRGTYVCRMETE YGPS VTS I PVI VIAYPPR 
ITSEPTPVT YTRPGNTVKLNCMAMGI PKADITWELPDKSHLKAG 
VQARL YGNRFLH P QGS LT I QHATQRDAG FYKCMAKN I LGS DS KT 
TYIHVF 


6430 


1946 


602 


RTRVSTGLRKTLLWSEAVGASSTRGDTGIPGSGEGGAGPGGGEG 
AMLEAMAEPSPBDPPPTLKPETQPPEKRRRTIEDFNKFCSFVLA 
YAGYIPPSKEESDWPASGSSSPLRGESAADSDGWPSAPSDLRTI 
QTFVKKAKSSKRRAAQAGPTQPGPPR5TFSRLQAPDSATLLEKM 
KLKDSLFDLDGPKVASPLSPTSIjTHTSRPPAALTPVPLSQGDLS 
HPPRKKDRKNRKLGPGAGAGKGVLRRPRPTPGDGEKRSR1KKSK 
KRXLKKAERGDRLPPPGPPOAPPSDTDSEEEEEBEEEEEBEEMA 
T WGG EAP VP VLPT PPEAPRPPATVHPEGVP PADSES KEVGSTE 
TSQDG DAS SS EGEMRVMDED I MVESGDDS WDLITC YCRKPFAGR 

PMIECSLCGTWIHLSCAKIKKTNVPDFFYCQKCKELRPEARRLG 
GPPKSGEP 


6431 


3 


605 


WWNS S YNL PAYAPYLPCEACAMQDGRKGGAYAGKMEATTAGVGR — 
liEEEALRRKERLKALREKTGRKDKEDGEPKTKHLREEEEEGBKH 
RELRLRNYVPEDEDIiKKRRVPQAKPVAVEE KVKEQLEAAKPBPV 
I EEVDIANLAPRKPDWDLKRDVAKKLEKLKKRTQRAlAELIRER 
LKGQEDSLASAVDAATEQKTCDS D 


6432 


56 


1692 


GGLGTMGSRIKQNPETTFEVYVEVAYPRTGGTLSDPEVQRQFPE 
DYSDQEVLQTLTKFCFPFYVDSLTVSQVGQNFTFVLTDIDSKQR 
FGFCRLSSGAKSCFCILSYLPWFEVFYKLLNILADYTTKROENQ 
WNELLE TLHKLP I PD PG VS VHLS VHS YFTVPDTREL PS I PENRN 
LTEYFVAVDVNNMLHIiYASMLYERR ILI I CSKLSTLTACIHGSA 
AMLYPM YWQHVYI P VLPPHLLDYCCAPM P YLXGIHLS LME KVRN 
MALDDWILNVDTNTLETPFDDLQSLPNDVISSLKNRLKKVSTT 
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1 SECT 
ID 
NO: 



Predicted 
beginning 
nucleotide 
location 

corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end

nucleotide 
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to first 
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I residue of 

amino acid 

sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6433






TQDGVARAFLKAQAAFFdSYRNALKIEPEEPITFCEKAFVSHYR 
5GAMRQFLQNATQLQLFKQFIDGRLDLLNSGEGFSDVFEEE1NM 
GB YAG S D KLYHQ WLS TVKKGSGAI LNTVKTK&NPAMKTVYKFD I 
AENGCAPTPEEQLPKTAPS PLVEaKDPKLRBDRRPITVHFGQVR 

PPRPHWKRPKSNIAVEGRRTSVPSPEQNTIATPATLHILQKSI 
TKFAAKFPTRGWTSSSH 


6434


1524 


484


APVTKRKEWAKDSKGSALDAGRDPKRPALPETiCESGWASNTA 
PTTPPQPGWCLCGKDFKSSCQTPGREKERRLATMHGSCSFLMIiL 
LPLLLLLVATTG PVGALTDEEKRLMVE LHNLYRAQVS PTAS DML 
HKRWDEELAAFAXAYARQGVWGHNKERGRRGENLFAITDEGMDV 
PLAMEEWHHEREHYNLSAATCSPGQMCGHYTQVVWAKTERIGOG 
SHFCEKLQGVEETNIELLVCNYEPPGNVKGKRPYQEGTPCSQCP 
SGYHCKNSLCEPIGSPEDAQDLPYLVTEAPSFRATEASDSRKMG 
AEGPDKPSWSGLMSGPGHVWGPLLGLLLLPPLVLAGIF 


6435


40 


2002 


mpqi^fgmadptqmgqlsmlllagehalgtpbvfsgtcrpdVse^ 

S PELRQKS PLFQFAEI SS S TSHSDAS TKQOQTS ALFQFAB ISSN 
TSQU3GAEPVKRCGKSALFQIJMSMCliASE<^KMEESKLIKAKES 

dggrikelekgkeekeikmektdetrlqkeaefeksakbnlrds 

KELRMFEALQIDDIMAIKMEDPKEIRKEELEEDHKCSHFPDFSY 

SASSKI iisdvpsrkdhmchphgimi iedpaalnkpeklkkkkk 
kskmdrhgndkstpkktckkrqssesdiesviytibavakgdwg 

IEKU3DTPRKKVRTS3SGKGSILDAKPPKJOCVKSREKKMSKEKS 

sdttkesrppdfisisasknisgetpegikaepltpmedalpps 

LSGQAXPEDSDCHRKIEXCGSRKSERSCKGALYKTLVSEGMLTS 
LRANVDRGKRSSGKGNSSDHEGCWNBESWTFSQSGTSGSKXFKK 
TKPKBDCIiLGSAKLDEEFBKKFNSLPQYSPVTFDRKCVPVPRKK 
KKTGNVSSEPTKTS KGSGDKWSNKQLFLDAIHPTEAI FSEDRNT 
MEPVHKVKNIPSI FNTPEPTTTARTFGGQPKEKSKENPDYSPCQ 
DTQRAG YHH EEVLWMTNLMNNCGGVYLKQLRHTAM7NA 


6436 


2227


657 


ALQRDAAAAYAHPEYEERFIiQEETVSQQINSIEiltiQtRPLALPE 
WKSQRPLQRQVHIiRGRPASQPTVIRGITYYKAKVSEEENDIEE 
QQDEFFSGDNGVDLIjIEDQLIiRHNGLMTS VTRRPAATRQGHS TA 
VTSDLNARTAPWSSALPQPSTSDPSIANHASVGPTLQTTSVSPD 

ptresvlqpspqvpatwahtato^paapappavsprealmbam 
htvpvppttvrtdslgkdapagrgttpasptlspeeeddirnvi 

GRCKDTLSTITGPTTQNTYGRNEGAWMKDP1JUGDERIYVTNYYY 

gntlvefrnlenfkqgrwsnsyklpyswigtghwyngafyynr 
aftrniikydlkqryvaawamlhdvayeeatpwrwqghsdvdfa 
vdenglmliypalddegfsqbvivlsklnaad^stqkettwrtg 
lrrnfygncfvicgvlyavdsynqrnanisyafdthtntqivpr 
llfeneyfyttqidynpkdrllyawdnghqvtyhvifay 


6437 


1295 


341 


t^CRPPVRQDPDSGPDYEALPAGATVTTHMVAGAVAGILEHCVM 

ypidcvktrmqslqpdpaaryrnvlealwrhrteglwrpmrgl 
nvtatgagpahalyfacye klkktls dvihpggnsh i angaag c 

VATLLHDAAMNPaE WKORMOM ymc p vun^pmprre m ^a*™ « „ 

AFYRSYTTQLTMNVPFQAIHFMTYEFtiQEHFNPQRRYNPSSHVL 

SGACAGAVAAAATTPLDVCKTIiLNTQESIALNSHITGHITGMAS 

AFRTVYQVGGVTAYFRGVQARVIYQIPSTAIAWSVYEFFKYLIT 
KRQEEWRAGK 




1828


3*0 . 


PPAPAPPASPARHVTRTARGHLEGGSRAPPLLO^VFLQIXNMVk — 

LI HTLADHGDDVNCCAFS FSLLATGS IiDKTIRL YSLRDFTELPH 

S PLKFHTYAVHCCCFS PS GH 1 LAS CS TDGTTVLWNTENGQMLAV 

MEQPSGSPVRVCQFSPDSTCIJVSGAADGTVVLWNAQSYKLYRGG 

SVKDGSLAACAFSPNGSFF\nCGSSCGDLTVWDDKMRCLKSEKAH 

DLGITCCDFSSQPVSDGEQGLQFPRLASCGQDCQVKIWIVSFTH 

ILGFELKYKSTLSGHCAPVLACAFSHDGQMLVSGSVDKSVIVYD 
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ID 
NO: 
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beginning 
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1 oca t" i ftn 

corresponding 
to first 
amino acid 
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amino acid 
sequence 
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nucleotide 
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to first 
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amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








TNTENILHTLTQHTRYVTTCAFAPNTLLLATGSMDKTVNIWQFD 
LETLCQ ARST BHQLKQ FTE DW S E EDVSTWLCAQDLKDLVG I FKM 
NN I DGKELLNLTKES LADDLKI ESLGLRSKVLRKIEELRTKVKS 
L S S G I PDE F I CP I TRELMKDPVI ASDGYS YEKEAMENWD PAKRN 
RTSPP 


6438 


109 


901 


EVQ I LRAKMFQTGGIi IVFYGLLAQTMAQFGGLPVPLDQTLPLNV 
NPALPLS PTGLAGSLTNALSNGLLSGGLLGI LENLPL LD I LKPG 
GGTSGGLLGGLLGKVTSVI PGLNNI IDIKVTDPQLLELGLVQSP 
DGHRLYVTI PLGI KLQVNTPLVGAS LLRLAVKLD I TAE ILAVRD 
KQERIHLVLGDCTHS PGS LQI SLLDGLGPLP I QGLLDSLTG ILN 
KVLPELVOGNVCPLVNBVLRGLD1TLVHDIVNMLIHGLQFVIKV 


6439 


23 


412 


SIQTASAITTEMASQSQGIQQLLQAEKRAAEKVADARKRKARRL 
KQAKEE AQMBVEQ YRREREHE FQS KQQAAMGSOjGNtiSAE VEQAT 
RRQVQGMQSSQQRNRERVLAQLLGI^VCDVRPQVHPNYRISA 


6440


3 


517 


RARWNSDMGDLPGLVRIiS IALRIQPNDGPVPYKVDGQttF^QNRT 
IKLLTGSSYKVEVKI KPSTLQVENIS IGGVLVPLELKSKEPDGD 
RWYTGT YDTEGVTPTKSGERQPI QI TMPPTD IGTFETVWQVKF 
YNYHKRDHCQWGSPFSVIEYECKPNETRSIWWVNKESFL 


6441 


234 


1373 


KSGGLRRRQRPGRSAAVGEEELPPGMEKFKAAMLIiGSVGDALGY 
RNVCKENSTVGMKIQEELQRSGGLDHLVLSPGEWP VS DNT I MH I 
ATAEALTTD YWCLDDLYREMVRCYVE I VEKL PERRPDPATI EGC 
AQLKPNNYLLAWHTPFNEKGSGFGAATKAMCIGLRYWKPERLET 
LIEVSVE CGRMTHNHPTG FLG S LCTALF VS FAAQGKPLVQWGRD 
MIJRAVPLAEEYCRKTIRHTAEYQEHWFYFEAKWQFYLEERKISK 
DSENKAI FPDNYDAEEREKT YRKWSS EGRGGRRGHDAPM I AYDA 
LLAAGNS WTELCHRAMFHGGESAATGTI AGCLFGI*LYGLDLVP K 
GIiYQDLEDKEKLEDLGAALYRLSTEEK 


6442 


34 


796 


aedpagglagqdtmfarglkrkcvgheedvegalaglktvSSyS 
lqrqs lldmslvklqlchm lve pnlcrs vli ajjtvrqiqeemtq 
dgtwrtvapqaaerapldrlvsteilcraawgqegahpasglgd 

GHTQGPVSDLCPVTSAQAPRHLQSSAWEMDGPRENRGSFHKS LD 
Q IFETLETKNPSCMEELFSDVDSPYYDLDTVLTGMMGGARPGPC 
EGLEGLAPATPGPSSSCKSDLGELDHWEILVET 


6443 


2 


555 


MASPAASSVRPPRPKIOilPQTLVlPKNAAEEQKLKLERI^lKNPDK 
AVPIPEKMSEWAPRPPPEFVRDVMGSSAGAGSGEFHVYRHLRRR 
EYQRQDYMDAMAEKQKLDAEFQKRLEKNKIAAEEQTAXRRKKRQ 

VPSFTMGR 


6444 


390 


899 


GSTPRGKMRAPI PE P KPGDL I E I FRPFYRHWAI YVGDGYVVHLA 
P PS EVAGAGAAS VMSALTDKA I VKKELLYDVAGSDKYQVNN KHD 
DKYS PLPCS KI IQRAEELVGQE VL YKLTS EWCEHFVNELR YGVA 
RSDQVRDVI IAAS VAGMGIAAMSLIGVMFSRNKRQKQ 


6445 


2 


753 


AGAAGAAGAARS PRPQAHTKGVRGLP SRRRSPDCGRMELAAGS F 
SEEQFWEACAELO^PALAGADWQLLVETSGISIYRLLDKKTGLY 
EYKVFGVLEDCSPTLIiADIYMDSDYRKQWDQYVKELYEQECNGE 
TVVYWEVKYPFPMSNRDYVYLRQRRDLDMEGRKIHVILARSTSM 
PQ LGERSG VI R VKQ Y KQSIiAI ES DGKKGS KVFM YYFDNPGGQI P 
SWliINWAAKNGVPNFLKDMARACQNYLKKT i 


6446 


1 


1651 


RCPTRSPPPbTPGSRGTTAMCSLASGATGGRGAVENEEDLPECs" 

DSGDEAAWEDEDDADLPHGKQQTPCLFCNRLFTSAEETFSHCKS 

EHQFNIDSMVHKHGLEFYGYIKLINFIRLKNPTVEYMNSIYNPV 

PWEKEEYLKPVLEDDLLLQFDVEDLYEPVSVPFSYPNGLSENTS 

WEKLKHMEARALSAEAALARAREDLQKMKQFAQDFVMHTDVRT 

CSSSTSVIADLQEDEDGVYFSSYGHYGIHEEMLKDKIRTESYRD 

FI YQNPH I FKDKWLDVGCGTG1 LSMFAAKAGAKKVLGVDQSE I 
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sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








liYQAMDI IRLNKLEDTITLI KGKIEE VHLPVEKVDVl ISEWMGY 
FLLFESMLDS VL YAKNKYLAXGGSVY PD I CTISLVAVSD VNKHA 

DRIAFWDDVYGFJCMQr , MVvavTDi?&TnrpT7T nnirmr Tonn^TTm 
i><vxnc nuuv tur AJ^ov^iv.ivAVi.^jiAV vfc. VliDPiCTLISEPCGIKH 

I DCHTTS I SDLEFSSDFTLKITRTSMCTAIAGYFD1 YFEKNCHN 

R WFSTGPQS TKTHWKQTVFLLEKPFS VKAGEALKGKVTVHKNK 

KDPRS LTVTLTLNNSTQT YGLQ 


6447 


1554 


1068 


RLGPAEWHLSGPCHATIiC^ANRGRALGVRAAWRGA&LCQR^P 
SRTNLATG I PSS KVKYSKLSS TDDGYIDIiQFKKTPP KI PYKAIA 
LATVLFLIGAFLIIIGSLLLSGYISKGGADRAVPVLIIGILVFL 
PGFYHLRIAYYASKGYRGYSYDDIPDFDD 


6448 
6449 


74 


559 


GQ VLSHC YH YRS SRWRRGGLSRGRGAGVMALVPYE ETTE FGLQK 
FHKPLAT FS FANHT IQ IRQDWRHLGVAAWWDAA I VLS TYLEMG 
AVELRGRS AVEIXSAGTGLVGI VAALLACR I R YERDNN FLAMLER 
QFI VRKVH YDPEKDVHI YEAQKRNQKEDL 




597 


1876 


eygvcenlrklkXtgvscrdVYakli.hryrhIlglwqpdigpyg— 

gllnvvvdglfiigwmylpphdphvddpmrfkplfrihi,merka 

atvecmyghkgphhghiqivkkdefstkcnqtdhhrmsggrqee 

frtwlreewgrtledifhehmqelilmkfiytsqydncltyrri 

ylppsrpddlikpglfkgtygshgleivmlsfhgrrargtkitg 

DPN I P AGQQTVE 1 DLRHRIQLPDIjENQRNFNELSRI VLE VRERV 

rqeqqeggheagegrgrqgpresqpspaqpraeapskgpdgtpg 
edggepgdavaaaeqpaqcgqgqpfvlpvgvssrnedyprtcrm 
cfygtgliaghgftspertpgvfilfdedrfgfvwlelksfsly 

v yAi JTKWADAPSPQAFDEMLKN I QSLTS 


6450 


848


269 


Wpaprtvsgkrslpgeweergegeqrtgrefsgnggraveaar 

MRLLCGLWLWLSLLKVLQAQTPTPLPIiPPPMQSFQGNQFQGEWF 

vlglagnsfrpehrallnaftatfelsddgrfevwnamtrgqhc 
dtwsyvlipaaqpgqftvdhrvwtheqagrpqdqpagqelvaas 

HDARPVHT.t>f3nccr , DTj^ 
e v nu e \y y o a \d It iAy 


6451 


232 


939 


HbPTPPTSPRASTMEDVKLEFPSLPQCKEDABEWTYPMRREMQE 
ILPGLFLGPYSSAMKSKLPVLQKHGITHIICIRQNIEANFIKPN 
F^QLFRYLVLDIADNPVENIIRFFPMTKEFIDGSLO>K3GKVLVH 
GNAGISRSAAFVIAYIMETFGMKYRDAFAYVQERRFCINPNAGF 
vn * iJ w £ ' * c>t\± i liaiujI iUrlrj&t'IjQIERSIiS VHSGTTGSLKRTHE 
EEDDFGTMQVATAQNG 


6452 
6453


1 


652 


KTRGESSNMEPIAAYPI^dS^PRAKVFAVLLSIVLC^LFLLQ 
LKFI^PKINSFYAFEVKDAKGRWSLEKYKGKVSLVVNVASDCQ 
LTDRN YI/3LKELHKE FG PSHFS VLAFPCNQ FGESEPR p SKEVES 
FARKNYGVTFP X FHKT TT^QPriwoa m m xrr\c e inmnnr L.m.„ 

YLVNPEGQWKFWRPEEPI EVIRPDIAALVRQVI IKKKEDL 




827 


223 


HKRWIiPGLSMSPRRTLPRPLSLCLSLCLCLCLAAAI^SAQSGSC 
RDKKNCKWFSQQELRKRLTPLQYHVTQEKGTESAFEGEYTHHK 
D PG I YKCWCGTPLFKSETKFDSGSGW PS FHDV1NSEAITFTDD 
FSYGMHRVETSCSQCGAHLGHIPDDGPRPTGKRYCINSAALSFT 
PADSSGTAEGGSGVAS PAQADKAEL 


6454 


827 


223 


HRRWLPGLSMS PRRTLPRPLSLCLSIiCIjCLCLAAALGSAQSGS c 
RDKKNCKWFSQQELRKRLTPLQYHVTQEKGTESAFEGEYTHHK 
DPGIYKCWCGTPLFKSETKFDSGSGWPSFHDVINSEAITFTDD 
FSYGMHRVETSCSQCGAHLGHIFDDGPRPTGKRYCIKSAALSFT 
PADS SGTAEGGSGVAS PAQADKAEL 


6455 


1042 


173 


RVHUVTVSASAAWDALGLPVRSHMQGSTRRMG\^TDVHRRFLQL 
LMTHG^EEWDVKRLQTHCYKVHDRWATVDKLEDFIMTINSVIiE 
SLYI E I KRG VTEDDGRP I YALVNLATTS I S KMATDFAENELDLF 
RKALELI IDSETGFASSTNILNLVDQLKGKKMRKKEAEQVLQKF 
VQNKWLIEKEGEFTIiHGRAILEMEQYIRETYPDAVKlCNICHSL 
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amino acid 
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Predicted end 
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Amxno acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








L I QGQS CET CG IRMHLPCT/AKY FQSNfAEPRCPHCNDYW Ptt Kl P K 
VFD PEKERESGVLKSNKKS LRS RQH 


6456 


2 


555 


RPQSRS ISMWRNSLLQVSSGLRWLRVCAMVDILGERHLVTCKGA 
TVEAEAALQNKVVALYFAAARCAPSRDFTPLLCDFYTALVAEAR 
RPAPFEWFVSADGSSQEMLDFMRELHGAWLALPFHDPYRHELR 
KRYNVTAI P KLVI VKQNGEVI TNKGR KQ I RE RGLAC FQD WVEAA 
DIFQNFSV 


6457 


23 


892


PTTGFPVTNFPWNWPDGKPPIMILWSKLNKil^FPDFDKKIPV 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLIjET 
1 1 LGKQYSLNI ILSVFAI ILGAFIAAGSDLAFNLEGY I FVFLND 
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMI iptli ISVSTG 
DLQQATEFNQWKNWFILQFLLSCFLGFLLHYSTVLCSYYNSAIj 
TTAVVGAI KNVSVAY IG I L IGGD Y I FSLLNFVGLNICMAGGLRY 
SFLTLSSQLKPKPVGEENICLDLKS 


6458 


23 


892 


PlTGFP^FPWNWPDGKPPIMiL^^mJKIilkFPDFDKKIPV " 

KLFPLPLLYVGNHISGI*SSTSKLSLPMFTVLRKFTIPLTLLLET 

I I LGKQYSLNI I LSVFAI ILGAFIAAGSDLAFNLEG YI FVFLND 

I FTAANGVYTKQKMDPKELGKYGVLFYNACFMI IPTLI I S VSTG 

DLQQATEFNQWKNVVFILQFLLSCFJ^FXL^STVLCSYYNSAL 

TTAWGAIKNVSVAYIGILIGGDYIFSLLNFVGLNICMAGGLRY 

SFLTLSSQLKPKPVGEENICLDLKS 


6459 


23 


892 


PTTGFPVT^FPt^fapt^^PPlMlLWSKLtJKllHF'PbFDKKIPV 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVIiRKFTIPLTLIiLET 
1 1 LGKQYSLNI ILSVFAI I LGAFI AAGS DLAFNLEGY I FVFLND 
I FTAANG VYT KQ KMD P KE LGKYGVL FYNAC FM I I PTLI I SVSTG 
DLQQATEENQWIQJVVPIIiQFLLSCFIiGFIiIjMYSTVLCSYYNSAL 
TTAWGAI KNVS VAYIG I LIGGDYI FSLLNF VGLNI CMAGGLRY 
SFLTLSSQLKPKPVGEENICLDLKS 


6460 


23 


892


PTTGFPVTNFPWNWPDGKPPIMtLWSktNKltHFPDFDKKIPV 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET 
I I LGKQYSLN I I LSVFAI ILGAFIAAGSDLAFNLEGY I FVFLND 
I FTAANG VYT KQ KMD PKELGKYGVLFYNACFM 1 1 PTLI ISVSTG 
DLQQATEPNQWKNVVFILQFLIiSCFLGFLLMYSTVLCSYYNSAL 
TTAWGAIKNVSVAYIGILIGGDYIFSLLNFVGLNICMAGGLRY 
SFLTLS SQLKPKP VGEEN I CLDLKS 


6461 


1653 


360 


LQQRTLR ITAVGQTHP I AWMAWfe PSL6 AFYGPAS F ITFVNCM YF 
LS IFIQLKRHPERKYELKEPTEEQQRLAANENGE INHQDSMSLS 
LISTSALENEHTFHSQLLGASLTLLLYVALWMFGALAVSLYYPL 
DLVFS FVFGATSLS FS AFFWHHCVNREDVRLAWI MTCCPGRS S 
YS VQVNVQPPNSNGTNGEAPKCPNSSAES SCTNKS ASSFKNS S Q 
GCKLTNLQAAAAQCHANSL PLNSTPQLDNSLTEHSMDND I KMHV 
APLEVQFRTNVHS SRHHKNRS KGHRASRLTVLRE YAYDVPTS VE 
GSVQNGLPKSRLGNNEGHSRSRRAYLAYRERQYNPPQQDSSDAC 
ctiLtVtJ^^Ki\t CiAi'vol loJUUJALRivPAVVELEWQQKSYGLNLAI 
QNGPI KSNGQEGPLLGTDSTGNVRTGLWKHETT V 


6462


3 


773 


SEELDREKKLKEDS PRKTPNKESGVPSLPVSLTS I KEEPKEAKH 
PD SQSMEES KLKNDDRKTPVNWKDSRGTRVAVS SPMS QHQS Y I Q 
YLHAYP YPQMYDPSHPAYRAVS PVLMHS YPGAYLS PGFHYPVYG 
KMSGRE ETE KVNTS PS VNTKTTTES KALDLLQQHANQYRSKS PA 
PVEKATAERBREAERERDRHSPFGQRHLHTHHHTHVGMGYPLI P 
GQYD PFQGLTSAALVASQQVAAQASASGMFPGQRRE 


6463 


2 


350 


VILCILGGWIFKNADRSMBKKKGEPRTRAEARPWVDEDLKDSSD 
LHQAEEDADE WQESE ENVEH I P FSHNH YPEKEMVKRS QEFYELL 
NKRRSVRFISNEQVPMEVIDNVIRTAGL 


6464 


12 


1154 


G I LRQKEREERNR I HKKEI LFLEHLL WPSEMSS LSGKVQTVLG 
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alUJ.no dClu 
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amino acid 
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Predicted end 
nucleotide 
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to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LVEPSKLGRTLTHBHIiAMTFDCCyCPPPPCQEAISKEPIVMKNL 
YW I Q KNAY S H KENLQLNQ ETEAI KE ELLY FKANGGGALVENTTT 
G iSRDTQTLKRIiAEETGVHIISGAGFYVDATHSSETRAMS VEQL 
TDVLMNEILHGADGTS IKCGI IGE IGCSWPLTESERKVLQATAH 
AQAQLGCP VI IHPGRS SRAPFQ 1 1 RX LQEAGADIS KTVMSHLDR 
T I LDKKBLLE FAQLGC YLE YDLFGTELLHYQLGPD I DMPDDNKR 
I RRVRLLVEEGCEDRI LVAHDIHTKTRLMKYGGHGYSHIIiTNW 
PKMLLRGITENVLDKILIENPKQWLTFK 


6465


126 


1396 


KMTVFFKTLRNHWKKTTAGLCl*IiTWGGHWLYGK^CDWLLRR5iAC ' 
QEAQVFGNQLIPPNAQVKKATVFLNPAACKGKARTLFEKNAAPI 
l^LSGMDVT I VKTDYEGOAKKLLELMENTDVI IVAGGDGTLQEV 
VTGVLRRTDEATFSKIPIGFIPLGETSSLSHTLFAESGNKVQHI 
TDATLAI VKG ETVPLD VLQ I KGEKEQP VFAMTGLRWGS FRDAG V 
KVS KYW YLE P LK I KAAH FFSTLKEW PQTHQAS I S YTGPTERPPN 
EPEETPVQRPSLYRRIIjRRLASYWAQPQDALSQEVSPEVWKDVQ 
LSTIELSITTRNNQLDPTSKEDFLNICIEPDTISKGDFITIGSR 
KVRNPKLHVEGTECLQASQCTLLI PEGAGGS FS IDSEE YEAMPV 
EVKbLPRKLQFFCDPRKREQMLTSPTQ 


6466 


1134 


828 


VARG TELSQLEKAH P PADMGRRKS KRKPPPKKKMTGTLETQFTC 
PFCNHEKS CD VKMDRARNTG V IS CTVCLEEFQT PITYLS E P VD V 
YSDWIDACEAANQ 


6467 


301 


2571 


GELRVLALAHGELACHAVLTASLLSLRSRLMDSDMDYER PNVET 
I KCVWGDNAVG KTRL I CARACNATLTQYQLLATHVPTVWAI DQ 
YRVCQEVLERSRDWDDVSVSLRLWDTFGDHHKDRRFAYGRSDV 
WLC FS I AN PNSLHHVKTMWYPE I KHFCPRAP VI LVGCQLDLRY 
ADLEAVNRARRPLARPIKPNEILPPEKGREVAKELGIPYYETSV 
VAQFGIKDVFDNAIRAALISRRHLQFWKSHLRNVQRPIiLQAPFL 
PPKPPPPIIVVPDPPSSSEECPAHLLEDPLCADVILVLQERVRI 
FAHKI YLSTS SSKFYDLFLMDLSEGE LGGPSE PGGTHPEDHQGH 
SOQHHHRHHHHHGROFLLRAAS FDVCES VDEAGGSGPAGLRAST 
SDGILRGNGTGYLPGRGRVLSSWSRAFVSIQEEMAEDPI.TYKSR 
LMVWKMDSSIQPGPFRAVLKYLYTGELDENERDLMHIAHIAEL 
LEVFDLRMMVANILNNEAFMNQEITKAFHVRRTNRVKECLAKGT 
FSDVTF IkDDGTISAHKPLL IS SCDWMAAMFGGPFVES STREW 
FPYTSKSCMRAVLE YLYTGMFTSSPDLDDMKIiI IliANRLCLPHIi 
VALTEQYTVTGLMEATQMMVD IDGDVLVFLELAQFHCAYQLADW 
CLHHICTNYNWCRKFPRDMKAMSPENQEYFEKHRWPPVWYLKE 
EDHYQRARKEREKEDYLHLKRQ PKRRWLFWNS PS S PSSSAA5 SS 
SP5SSSAW 


6468 


3 


1374 


DAWAGTNMAALAPVGSPASRGPRLAAGLRLLPMLGIal^LLAEPG 
LGRVHHLALKDDVRHKVHLNTFGFFKIX5YMVVNVSSLSLNEPED 
KDVTIGFSLDRTKNDGFSSYLDEDVNYCIIiKKQSVSVTLLlLDI 
SRSEVRVKSPPEAGTQIiPKI I FSRDEKVLGQSQBPNVNPAS AGN 
QTQKTQDGGKS KRSTVDS KAMGEKS FS VHNNGGAVS FQFFFNIS 

GEIPLPKIiYISMAFFFFljSGTIWIHILRKRRNDVFKIHWLMAAli 
PFTKSLSLVFHAIDYHYISSQGFPIEGWAWYYITHLLKGALLF 
ITIALIGTGWAFIKHILSDKDKKIFMIVIPRRVtiANVAYIIIBS 
TEEGTTEYGLWKDSLFIiVDLLCCGAILFPVVWSIRHIiQEASATD 
QKGKFS RAHEVLLS hit 


6469 


3 


1374 


DAWAGTNMAALAPVGS PASRGPRLAAGLRLLPMIX3LLQLLAEPG 
LGR VHHLAL KD D VRHKVHLNTFG F FKDG YMWNVS S I>S LNE P ED 
IQVTIGFSLDRTKNDGFSSYl^EDVNYCILKKQSVSVTLLILDI 
SRSEVRVKS PPEAGTQLPKI I FSRDEKVLGQSQBPNVNPAS AGN 
QTQKTQDGGKSKRSTVDSKAMGEKSFS VHNNGGAVS FQFFFNIS 
TDDQEGLYSLYFHKCLGKELPSDKFTFSLDIEITEKNPDSYLSA | 
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sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








GBIPLPKLYISMAFFFFI^GTIWIHIl^XRRNDVFKIHWtMAAL 
PPTKSLSLVFHAIDYHYISSQGFPIEGWAWYYITHLLKGALLF 
ITIALlGTGWAFIKHILSDKDKKIFMIVIPRRVLANVAYrilES 
i JLHAa I TE X GLiWKDS h FIjVDLLCCGaILF PWWS I RHLQE AS ATD 
GKGKFSRAHFVLLSLL 


6470 


272* 


1437 


AAASGVSSRADAPVLAQSPASAGNGRPSTPRVPGSRRHPSAPRS 
GPL P REDGCRT PGPQLLPLPGALLR PRTLLS SAAETGRS RHPDT 
QHPSSGGRCRGGTBSPSSAAGRPASMAEAEEDCHSDTVRADDDE 
ENESPAETDLQAQLOMFRAQWMFELAPGVSSSNLENRPCRAARG 
SLQKTSADTKGKQEQAKEEKARELFLKAVEEEQNGALYEAIKFY 
RRAMQLVPD I EFKI TYTRSPDGDGVGNS YI EDNDDDSKMADLLS 
YFQQQLTFQESVLKLCQPELESSQIHISVLPMEVLMYIFRWWS 
SDLDLRSLEQLSLVCRGFYICARDPEIWRLACLKVWGRSCIKLV 
PYTSWREMFLERPRVRFDGVYISKTTYIRQGEQSUX3FYRAWHQ 
VEYYRYIRFFPDGHVMMLTTPEEPQSIVPRLRTR 


6471 


1750 


233 


FFFPKMAAGGSGVGGKRSSKSDADSGFLGLRPTSVDPALRRRRR 
GPRNKKRGWRRLAQEPIiGLEVDQPLEDVRLQERTSGGLLSEAPN 
EKLFFVDTGS KEKGLTKKRTKVQKKSLLLKKPLRVDiLILENTSK 
VPAPKDVLAHQVPNAKKLRRKEQLWEKLAKQGE LPREVRRAQAR 
LLNPS ATRAK PG PQDTVBRP FYDL WASDNPLDRPLVGQDE FFLE 
QTKKKGVKRPARLHTKP S QAPAVE VAPAGAS YNPS FEDHQTLLS 
AAHEVELQRQKI^KLERQLAIiPATEOAATQES TFQELCEGLLE 
ESDGEGEPGQGEGPEAGDAEVCPTPARLATTEKJCTEQQRRREKA 
VHRLRVQQAALRAARLRHQELFRLRGI KAQVALRLAELARRQRR 
RQARREAEAD K PRRLGRL KYQAPD I D VQLS S ELTDS LRTLKPEG 
NILRDRFKSFQRRNMIEPRERAKFKRKYKVKLVEKRAFRErQL 


6472 


3 


897 


S CGS DRAQWAME FPFDVDALFPERITVLDQHLRP PARR PQTTT P 
ARVDLQQQIMTI IDELGKASAKAQNLSAPITSASRMQSNRHWY 
ILKDSSARPAGKGAI IGF UCVGYKKLFVLDDRBAHNEVEPLCIL 
DFYIHESVQRHGHGRELFQYMIiQKBRVEPHQIiAIDRPSQKLLKF 
IiNKHYNLETTVPQVNNFVI FEGFFAHQHRPPAPSLRATRHSRAA 
AVDPTPAAPARKLPPKRAEGDIKPYSSSDREFLKVAVEPPWPLN 

RAPPP&TDDSVDDDDCCOT /'VTDMnnnr nnmm 


6473


22 


312 


SSAVEFVWEGEKMAAE PNKTE I QTLFKRLRAVPTNKACFDCGAK 
NPSWASITYGVFLCIDCSGVHRSLGVHLSFIRSTELDSNWNWFQ 
LRCMQVGGNANATAFFRQHGCTANDANTKYNSRAAQMYREKIRQ 
wwonfumr.no i i^iJ« iUiMinooAv irWnortiWVJUoUr FTEHTQPPAW 
DAPATE PSGTQQPAPSTES SGLAQPBHG PNTDLLGTS PKAS LEL 
KSSI IGKKKPAAAKXGLGAKKGLGAQKVSSQS FSE I ERQAQVAE 
KLRBQQAADAKKQAEESMVASMRLAYQELOIDR 


6474 


3 


462


LQRQRQHPAAAPAVPVRCFTFCFTDlVIMPKRkSPENTEGKDGS 
KVTKQEPTRRSARLSAKPAPPKPEPKPRKTSAKKEPGAKISRGA 
KGKKE EKQEAGKEGTAPS BNGETKAEE IHI SRSTVNVSTSRGTP 
PSTLSVKGQI ETVRVKGTBN 


6475 


3 


462 


LQRQRQHPAAAPAVPVRCFTFCFl'DIVIMPKRKSPENTEGKDG'S 
KVTKQEPTRRSARLSAKPAPPKPEPKPRKTSAKKEPGAKISRGA 
KGKKEEKQEAGKEGTAPSENGETKAEEIHISRSTVNVSTSRGTP 
PSTLSVKGQIETVRVKGTEN 


6476 


106 


1030 


ARAMAQYKGTMREAGRAMHLLKKRERQREQMEVLkQRIAEETII, 
KSQVDKRFS AHYDAVEAELKSS TVGLVTLNDMKARQEALVRERE 
RQLAKRQHLE EQRLQQERQREQEQRRERKRKI SCLSFALDDLDD 
QADAAEARRAGNLGKNPDVDTS FLPDRDREE EENRLR EELRQ E W 
EAQREKVKDEEMBVTFSYWDGSGHRRTVRVRKGNTVQQFUCKAL 
QGLRKDFLELRSAGVEQLMFIKEDLILPHYHTFYDF 1 I ARARGK 
SGPLFSFDVHDDVRLLSDATMEKDESHAGKWLRSWYEKNKHIF 
PASRWEAYDPEKKWDKYTIR 



509 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6477 


227 


915 


LQGHLMG I MAASRPLSRFWEWGKNI VCVGRNYADHVREMRSAVL 
SEPVLPLKPSTAYAPEGSPILMPAYTRNLHHBLELGVVMGKRCR 
AVPEAAAMDYVGGYALCIiDMTARDVQDECKKKGLPWTIAKSPTA 
SCPVSAFVPKEKIPDPHKLKLWLKVNGEIiRQEGETSSMIFSIPy 
1 1 S YVSK1 1 TLEEGDI I LTGTP KGVGPVKENDE I EAG 1 HGLVS M 
TFKVEKPEY 


6478


2 


1495 


c voonxjjrnoijAsoriAaxijifiMWUKJUSiiaJJJCSSWKKO/ETNIRKTF 
IFMEVLGSGAFSEVFLVKQRLTGKLFALKCIKKSPAFRDSSLEN 
BIAVLKKIKHENIVTLEDIYESTTHYYLVMQLVSGGELFDRILE 
RG VYTE KDASLVI QQVLSAVKYLHENGI VHRDLKPENLLY LTPE 
ENSXIMITDFGI*S KMEQNG I MS TACGTPG YVAPEVLAQKP YS KA 
VDCWSIGVITYILLCGYPPFYEETESKLPEKIKEGYYEFESPFW 
DDISESAKDFICHLLEKDPNERYTCEKALSHPWIDGNTALHRDI 
YPS VS LQIQKNFAKS KWRQAFNAAAVVHHMRKLHMNLHS PGVRP 

*j v BiiArra lulwu ipulSfOoJrCii X -i. l KAirV iiUno VAIlPAllTQIjPC 

QHGRRPTAPGGRSLNCLVNGSLHISSSLVPMHQGSLAAGPCGCC 
SSCXNIGSKGKSSYCSEPTLLKKANKKQNPKSEVMVPVKASGSS 
HCRAGQTfiVCLIM 


6479 


3 


949 


SCRGPGWKPAGGQAGAM E LLSALS LGELALS FS RVPLFP V FDLS ~ 
YFIVSILYLKYEPGAVELSRR1IPIASWLCAMLHCFGSYILADLL 
I^EPLIDYFSNNSSILLASAVWYLIFFCPIiDLFYKCVCFIiPVKI, 
I FVAMKEWRVRKI AVG I HHAHHHYHHGWFVM I ATGWVKGS G VA 
LMSNFEQLLRGVWKP ETNE I LHMS FPTKASLYGAI LFTLQQTRW 
LPVSKAS L I FI FTLFMVSCKVFLTATHSHSS P FDALEG YI CPVL 
FGSACGGDHHHDNHGGS HSGGGPGAQHS AMPAKS KEEIiS EGSRK 
KKAKKAD 


6480 


192 


514 


DFMSIYFPIHCPPYLRSAKMTEVMmTQPMEEIGLSPRKDGLS~ 

QIPPDPSDFDRCCKLKDRLPSIWEPTEGEVESGELRWPPEEFJLi 

VQEDEQDNCEETAKENKEQ 


6481 


110 


1131 


KSRMDLDWNMFVI AGGTLAI P I LAF VAS FLLWPS AL 1 ft I YYW Y 
WRRTLGMQVRYVHHEDYQFCYS FRGRPGHKPS I LMLHGFSAHKD 
MWLSWKFLPKNLHLVCVDMPGHEGTTRSSLDDLS IDGQVKRIH 
QFVECLKLMKKPFHLVGTSMGGQVAGVYAAYYPSDVSSIjWLVCP 
AGLQYSTDNQFVQRLKELQGSAAVEKIPLIPSTPEEMSEMLQLC 
SYVRFKVPQQILQGLVDVR1PHNNFYRKLFLEIVSEKSRYSLHQ 
NMDKIKVPTQI IWGKQDQVLDVSGADMLAKS IANCQVELLENCG 
HSWMERPRKTAKLI IDFLASVHNTDNNKKLD 


6482 


2517 


568 


epvskvsqsrrkagvptanieesqaveaamaWpwaevcekfqa 

ALALSRVELHKNPEKEPYKSKYSARALLEEVKALLGPAPEDEDE 
RPEAEDGPGAGDHALGLPAEWEPE3PVAQRAVRLAVIEFHLGV 
NHIDTEELSAGEEHLV^nT.BT.T,T?T?VDT.CirnPTaT.PTn7i.riTkTKiry5 , r 

LWSEREE I BTAQAYLESSEALYNQYMKEVGSPPLDPTERFIjPEE 
EKLTEQERSKRFEKVYTHNLYYLAQVYQHLEMFEKAAHYCHSTL 
KRQLEHNAYHPIEWAINAATLSQFYINKLCFMEARHCLSAANVI 
FGQTGKISATEDTPEAEGEVPELYHQRKGEIARCWIKYCLTLMQ 
NAQLSMQDNIGELDLDKQSELRALRKKELDEEESIRKKAVQFGT 
GE liCDAI SAVEE KVS YIiRPLDFEEARBLFLLGQHYVFEAKEFFQ 
IDG YVTDH I E VVQDHS AIiFKGIAFFETDMERRCKMHKRRI AMLB 
PLT VDLNPQ YYLLVNRQ IQFE IAHAYYDMMDLKVAI ADRLRDPD 
SHI VKKINNLNKSALKY YQLFLDSLRDPNKVFPEH IGEDVLRPA 
MLAKFRVARLYGKI ITADPKKELENLATSLEHYKFI VDYCEKHP 
EAAQEIEVELEliSKEMVSLLPTKMERFRTKMALT 


6483 


3 


623 


NSHLLCXJLRARAPLSANGREARAMEQRi^^ " 
PAASO^QTPGEKAEAAATLKAAPGWLKRFLVWKPRPASARAQP 
GLVQEAAQPO/3STS ETPWEJTAIPLPSCWDQS FLTNITFLKVTjtiW 
LVLLGLFVBLEFGLAYFVLSLFYWMYVGTRGPEEKKEGBKSAYS 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








VFtf PGCEA I QGTLTABQLERELQLRPLAGR 
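
The one-letter legend in the column header can be read as a simple lookup table. The following is a minimal illustrative sketch, not part of the original disclosure; the names `LEGEND` and `expand_segment` are hypothetical, and whitespace inside a segment is assumed to be layout only.

```python
# Lookup table transcribed from the column-header legend.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid",
    "E": "Glutamic Acid", "F": "Phenylalanine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "K": "Lysine",
    "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine",
    "S": "Serine", "T": "Threonine", "V": "Valine",
    "W": "Tryptophan", "Y": "Tyrosine", "X": "Unknown",
    "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def expand_segment(segment):
    """Return the legend entry for each symbol in a segment.

    Whitespace is skipped (assumed to be layout only). Any symbol
    that is not in the legend raises ValueError rather than being
    guessed at.
    """
    names = []
    for symbol in segment:
        if symbol.isspace():
            continue
        if symbol not in LEGEND:
            raise ValueError("symbol not in legend: %r" % symbol)
        names.append(LEGEND[symbol])
    return names
```

Calling `expand_segment` on a table entry that contains characters outside the legend raises an error instead of silently accepting them, which makes it a quick check for transcription damage in a segment.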


6484 


201 


965 


QLAVKTKMSGLRPGTQVDPEIELFVKAGSDGBSIG^PFCQRLF 
MI LWLKG VKFNVTT VDMTRKPEE LKDLAPGTNPPFLVYNKEL KT 
dfikieefleqtiappryphlspicykespdvgftat.pak^qavtfr 
NTQKEANKNFEKSLLKEFKRLDDYLNTPIjLDEIDPDSAEEPPVS 
RRLFLDGDQLTLADCSLLPKLNI I KVAAKKYRDFD I PAE FSGVW 
RYLHNAYAREEFTHTCPEDKEIENTYANVAKQKS 


6485 


6 


1091 


FVDLVRAVEFLPCPDSQKLBKECQSSEESMGSNSMRSILEEDEE 
DEEPPRVLI»YHEPHSPEVGMl^UlirHVWlfTfVDi?t«JDJiTnriire\ror\t>nv' 

KASVLYIEGHMNPKMKGFTVSLKSLKHFDCKEKQTLLNQAREDF 
NQDIGWCVSIilTDYRVRLGOGSFAGSFLEYYAADISYPVRKSlQ 
QDVLGTKLPQLSKGSPEEPWGCPLGQRQPCRKMIjPDRSRAARD 

CVETYLEDEGQLDLWKYLQGVYQEVGAKVLQRTNGDRIRFILD 
VLLPEAIICAISAGDEVDYKTAEEKYIKGPSLSYREKEIFDNQL 
LEERNRRRR 


6486 


10 


581 


LVLQAGGAHLSPSRVTQGIYYMLAFSEMPKPPDYSELSDSLTIiA 
GGTGRFSGPLHRAWRMMKTFR fJTJMG W T ft VftT.V t.t acasi t?vvi;cu 

1 S ETYNRLALEH IQQHPEE PLEGTTWTHS LKAQLLS LP FWVWTV 
I FLVP YLQMFLFL YSCTRADPKT VGYC 1 1 PI CLAVI CNRHQAF V 
KASNQISRLQLIDT 


6487 


352 


863 


S FLKPLRGKMS VTLHTDVGDI KI E VFCERTPKTCEN FLALdHASN 
YYNGCIFHRNIKGFMVQTGDPTGTGRGGNSIWGKKFEDEYSEYL 
KHNVRG WSMANNGPNTNGSQFF I TYGKQPHLDMKYTVFGKVID 
GLETLDELEKLPVNEKTYRPLNDVHI KDITIHANPFAQ 


6488 


878 


241 


TALQEFGTSGPPLSLRFAIiPSGTGRFKPIjPGARGPSWPPSPRVP" 
ME PPNLYPVKLYVYDLS KGLARRLS P T MI/SKfYI .W5TWWT c T\nm 

KDEF F FGSGG I S SCPPGGTLLGPPDS WDVGSTE VTEE I FLE YL 
SS LGESLFRGEAYNLFBHNCNTFSNE VAQFLTGRKI PS Y I TDLP 
SEVLSTPFGQALRPLLDS IQIQPPGGSS VGRPNGQS 


6489 


1457 


375 


KVAKI^TALSEEELDNEDYYSLLNVRftEASSEELKAAYRRLCML 
YHPDKHRDPELKSQAERLFNLVHQAYEVLSDPQTRAIYDIYGKR 
GLEMEGWE WERRRTPAE3 REEFERLQREREERRliQQRTNPKGT 
I SVGVDATDLFDR YDEE YEDVSGSS FP Q I E INKMH I S QS I EAPL 
TATDTAI LSGSLSTQNGNGGG3 INFALRRVTSAKGWGELE FGAG 
DLQGPLFGLKLFRNLTPRCFVTTNCALQFSSRGI RPGLTTVLAR 
NLDKN WG YLQWHCSSPI»LQVQRPHRNTRACAPE PS FRPFLHVP 
TWUAECSGARTPSTAWTSAAVKLREACLSGPGSGSHQLLLLTPR 
SKRRTGGG 


6490 


3 


1183 


HEAGCEVWLGYGPRAAAAAAATVLFGGAGPTETMFVARS iaadh 
KDLIHDVSFDFHGRRMATCSSDQSVKVWDKSESGDWHCTASWKT 
HSGSVWRVTWAHPE FGQ VLAS CSFDRTAAVWEE I VGESNDKLRG 
QSHWVKRTTLVDSRTS VTDVKFAPKHMGLMLATCSADGI VRI YE 
APDVMNLSQWSLQHEISCKLSCSCISWNPSSSRAKSPMIAVGSD 
DSSPNAMAKVQI FEYNENTRKYAKAETLMTVTDPVHDIAFAPNL 
GRSFHIIJVIATKDVR1FTLKPVRKELTSSGGPTKFEIHIVAQFD 
NHNSQVWRVSWNITGIVLASSGDDGCVRLWKAl^DNWKCTGIIi 
KGNGSPVNGSSQQGTSNPSLG SNIPS LQKSLNGSS AGRKHS 


6491 


3 


1183 


HEAGCEVWLGYG PRAAAAAAATVLFGGAGPTETMFVARS IAADH 
KDL IHDVS FDFHGRRMATCS SDQS VKVWDKSESGDWHCTAS WKT 
HSGS VWRVTWAHPEFGQVLAS CS FDRTAAVWEEIVGESNDKLRG 
QSHWVKRTTLVDSRTSVTDVK FAPKHMGLMIATCS ADGI VR I YE 
APDVMNLSQWSLQHErsCKLSCSCISWNPSSSRAHSPMIAVGSD 
DSSPNAMAKVQIFEYNENTPJCYAKAETLMTVTDPVHDIAFAPNL 
GRSFII I LA IATKDVRI FTLKP VRKELTS S GGPTXFE IHIVAQFD 
NHNSQWRVSWNITGT\^SSGDDGCVRLWKANYMDNWKCTGIL 


6492 


34 


2573 


KGNGSPVNGSSQQGTSNPSU3SNI PSLQNSLNGSSAG^KHS 

XPKLKSCCCCCLFDFPPPPLDQVQEEBCEVBRVl-EHGTPKPFRK ' 
PDS VAFGE SQSEDEQFENDLETDPPNWQQLVS REVLLGLKPCE I 
KRQE VINE LFYTERAHVRTLKVLDQVFYQRVS REG I LS PS ELRK 
I FSNLEDI LQLHIGLNEQMKAVRKRNETS VIDQ IGEDLLTWFSG 
PGEEKLKHAAATFCSNQPFALEMIKSRQKKDSRFQTFVQDAESN 
PLCRRLQLKDIIPT^MQRLTKYPLLLDSIATYTBWPTEREKVKK 
AADHCRQILNYVNQAVKEAENKQRLEDYQRRLDTSSLKLSEYPN 
VEELRNLDLTKKKMIHEGPLVWKVKRDKTIDLYTLLLEDILVLri 
QKQDDRLVLRCHSKILASTADS KHTFS P VIKL S TVLVRQVATDN 
KALFVISCVISDNGAQIYELVAQrVSEKTVWQDLICRMAASVKEQS 
TKPIPLPQSTPGEGDNDEEDPSKLKEEQHGISVTGLQSPDRDLG 
LESTLI5SKPQSHSLSTSGKSEVRDLFVAERQFAKEQHTDGTLK 
EVGE DYQI A I PDSHLPVS EERWALDALRNLGLLKQLLVQQLGLT 
EK5VQEDWQHFPRYRTASQGPQTDSVIQNSENIKAYHSGEGHMP 
FRTGTGDIATCYS PRTSTES FAPRDS VGLAPQDSQASNILVMDH 
MIMTPEMPTMEPEGGLDDSGEHFFDAREAHSDENPSEGDGAVNK 
EEKD VNLR I SGNYL I LDGYDP VQESSTDEEVAS S LTLQPMTG I P 
AVES THQQQH S PQNTHS DGAI S PFTPEFLVQQRWGAMEYS CFE I 
QSPS S CADSQSQ IMEYTHK IEADLEHLKKVEES YT I LCQRLAGS 
ALTDKHSDKS 


6493 


557 


1147 


TPARMAYQGSSTSDCMSKTLDSASAHFAASAWSAPVPSRSEVA 
KEQNTGHNNINGWQPSGTS KTLYS TNMALSSS PG I S AVQLVRT 
VGHTTTNHLI PALCTSSPQTLPMNNSCLTNAVHLNNVSWSPVN 
VH INTRTSAPS P TALKLATVAASMDRVPKVTPSSAI S S IARENH 
EPERLGLNGIAETTVAMEVT 


6494 


2425 


1052 


AVAGGARPCSTPSSPHRRCRRHRPRPLPRPPAAIMSASAVYVLD 
LKGKVLICRNYRGDVDMSEVEHFMPI LMEKEEEGMLSPILAHGG 
VRFMW I KHNNL YLVATSKKNAC VSLVFS FLYKWQVFS E Y FKEL 
EEES IRDNFVI I YELLDELMDFGYPQTTDSKILQBYITQEGHKL 
ETGAPRPPATVTNAVSWRS EG I KYRKNEVFLDVI BSVNLLVSAN 
GNVLRS E I VGSIKMRVFLfiGMPELRLGLNDKVLFDNTGRGKS KS 
VELEDVKFHQCVRLSRFENDRTISFI PPDGEFELMSYRIAJTHVK 
PLIWIESVIEKHSHSRIEYMIKAKSQFKRRSTANNVEIHIPVPH 
DADSPKFKTTVGSVKWVPENSEI VWS IKSFPGGKE YLMRAHFGL 
PS VEAEDKEGKPPISVKFEIPYFTTSGIQVRYLKI IEKSGYQAL 
PWVR Y I TQNGDYQLRTQ 


6495 


2425 


1052 


AVAGGARPCSTPSSPHRRCRRHRPRPLPRPPAAiKSASAVVvtD 
LKGKVLICRNYRGDVDMSBVEHFMPIIiMEKEEEGKLSPILAHGG 
WFMWI XHNNLYLVATSKKNACVSLVFS FL YKWQVFSEYFKEL 
EEES IRDNFVII YBLLDELMDFGYPQTTDSKILQEYITQEGHKL 
ETGAPR P PAT VTNAVS WRS EG I KYR KNEVFLDVIES VNLLVSAN 
GNVLRS E I VG 3 1 KMR VF LS GMP ELRLGLNDKVIjFDNTGRG KS KS 
VELED VKFHQ CVRL S R FENDRT I S F I PPDGE FELMS YRLNTHVK 
PLIWIESVIEKHSHSRIEYMIKAKSQFKRRSTANNVEIHIPVPN 
DADSPKFKTTVGSVKWVPENSE I VWS IKS FPGGKEYLMRAHFGL 
PSVEABDKEGKPP I SVKFE I PYFTTSGIQVRYLKI IEKSGYQAL 
P WVR Y I TQNGDYQLRTQ 


6496 


247 


559 


LRAVSLLPLQLVLPSYSIHSLFCIMFLCAQEWLTLGLNVPtiLFY~ 

HFWRYFHCPADSSEIAYDPPVVMNADTLSYCQKEAWCKLAFYLL 

SFFYYLYCMIYTLVSS 


6497 


1053 


352 


ANTQICRLCPRRHLHPPCGAKMGNGTEEDYNFVFKWLIGESGV - 
GKTNLLSRFTRN6F5 HDSRTTIGVEFSTRTVMLGTAAVKAQ I WD 
TAGLERYRAITSAYYRGAVGALLVFDLTKHQTYAWERWLKSLY 
DHAEAT IWMLVGNKSDLSQAREVPTEEARMFAENNGLLFLETS 
ALD STNVBLAFE TVLKBI FAKVS KQRQNSIRTNAI TLGSAQAGQ 
E PG POEKRACCI SL 


6498 


2636 


272 


SLRLCPWGTHIAGPTTMRLSSLLALLRPALPLILGLSLGCSLSt 
LRVSWIQGEGEDPCVEAVGERGGPQNPDSRARLDQSDEDFKPRI 
VP YYRDPNKP YKKVLRTR Y I QTELG SRERLLVAVLTS RATLSTL 
AVAVNRTVAHHFPRLLYPTGQRG ARAPAGMQVVSHGDER PAWLM 
S ETLRHLHTHFGADYDWFFIMQDDTYVQAPRLAALAGHLS INQD 
LYLGRAEEF I GAGEQAR YCHGGFG YLLSRSLLLRLRPHLDGCRG 
D I LSARPDE WLG RCL I DS LG VG C VSQHQG QO YR S FELAKNRDPE 
KEGSSAFLSAFAVHPVSEGTLMYRIiHKRFSALELBRAYSEIEQL 
QAQ IRNLTVLTPEGEAGLSWPVGL PAPFTFHSRFEVLGWDYFTE 
QHT FS C ADG AP KC PLQGAS RADVG D ALETAL E QLNRRYQ PRLRF 
QKQRLLNGYRRFDPARGl^YTLDLLLECVTQRGHRRALARRVSL 
LRPLSRVE1LPMPYVTEATRVQLVLPLLVAEAAAAPAFLEAFAA 
NVLEPREHALLTLLLVYGPREGGRGAPDPFLGVKAAAAELERRY 
PGTRIAWLAVRAEAPSQVRLMDWS KKHPVDTLFFLTT VWTRPG 
PE VLNRCRMNAI SG WQAFFP VHFQEFNPALSPQRSPPGPPGAGP 
DP PS P PGADPSRGAPIGGRFDRQAS AEGCFYNADYLAARARLAG 
ELAGQEEEEALEGLEVMDVFLRFSGLHLFRAVEPGLVQKFSLRD 
CS PRLS EELYHRCRLSNLEGLGGRAQliAMALFEQEQAKST 


6499 


3 


2040 


SCSADTRPSGQAWPTVGLRAAAGAFRTGS PLALGPETPQVACLP 
GHPPVRPQVSGGPGAMPDPAAHLPFFYGS ISRAEAEEHLKLAGM 
ADGLFLLRQCLRSLGG YVLSLVHDVRFHHFP I ERQLNGTYAIAG 
GKAHCG PAELCE FYSRDPDGLPCNIiRKPCNRPSGLEPOPGVFDC 
LRDAMVRD YVRQT WKLEGE ALE QAI I SQAPQ VEKXi I ATTAHERM 
PWYHSSLTREEAERKLYSGAQTDGKFI>LRPRKEQGTYAIiSLIYG 
KTVYHYLISQDKAGKYC I PEGTKFDTLWQLVE YL KLKADGL I YC 
LKE ACPNSSASNASGAAAPTLPAH PSTLTHPQRRI DTLNS DGYT 
PEPARITSPDKPRPMPMDTSVYESPYSDPEELKDKKLFLKRDNL 
L IAD I E LG CGN FG S VRQG VYRM RKKQ I D VAI KVLKQGTE KADTE 

EMMREIAQIMHQLDNPYIVRL I GVCQAEALMLVMEMAGGGPLHKF 

T.Urtin>T?T?T wrcwvawT.T.tirtTrcMnMWT.WTJifKTisiftior^T 7s t\ n vn n r 
Lt vvj Jvrcr» r. icvoW V/vd uLulU V o n\jm J\ i xjctttVXt c v HKJJuAAitN VLtU 

VNRHYAKISDFGLSKALGADDSYY-XARSAGK^ELKWYAPECINF 

RKFSSRSDVWSYGVTMWEALS YGQKPYKKMKGPBVMAFI BQGKR 

MECP PECP PELYALMSDCW I YKWEDRPD FLTVEQRMRACY YS LA 

SKVEGPPGSTQKAEAACA 


6500 




726 


MLSESS S FLKGVMLGS IFCAL I TMLGHI R XGHGMRMHHHEHHHIi 
QAPNKED ILKISEDERMELSKS FRVYCI ILVKPKDVSLWAAVKE 
TWTKHCDKAEFFSSENVKVFES INMDTNDMWLMMRKAYKYAFDK 
YRDQYNWFFLARPTTFAIIENLKYFIVLKKDPSQPFYLGHTIKSG 
DLEYVGMEGGIVLS V3SMKRLNS LLN X PKKC PEQGGM I WK I S ED 
KQLAVCLKYAGVFAENAEDADGKDVFNTKSVGLSIKEAMTYHPN 
QWEGCCSDMAVTFNGLTPNQMHVMMYGVYRLRAFGPYFQ 


6501 


1 


570 


LVGMSGGGrETPVGCEAAPGGGSKKRDSLGTAGSAHLI IKDLGE 
IHSRXiLDHRPVIQGETRYFVKEFEEKRGLREMRVLBNLKNMIHE 
TNEHTLPKCRDTMRDSLSQVLQRLQAANDSVCRLQQREQERKKI 
HSDHLVASEKQHMLQWDNFNKEQPNKRAEVDEEHRKAMERLKEQ 
YAEMEKDLAKFSTF 


6502 


213 




AGNKPDP WAGRNRTAVLPDVS VFHREDVGWWRSWLQQ S YQAVKE 
KSSEALE FMKRDLTE FTQ WQHDTACT I AATAS WKE KLATEGS 
SGATEKMKKGLSDFLGVISDTFAPSPDKTIDCDVITLMGTPSG? 
AE P YDGTKARLYS LQS DPAT YCNE PDGP PEL FDAWLSQFCLEE K 
KGEISELLVGS PS I RALYTKMVPAAVSHS EFWHR YFYKVHQLEQ . 
EQARRDALKQRAEQS I SEEPGWEEEEEBLMG I S P IS PKEAKVP V 
AKISTFPEGEPGPQSPCEENLVTSVEPPAEVTPSESSESISLVT 
QIANPATAPEARVLPKDLSQKLLEASLEEQGLAVDVGETGPSPP 
IHSKPLTPAGH'IXjGPEPRPPARVETIiREEAPTDLRVFBIiNSDSG 
KSTPSNNG KKGSS TDI SEDWEI<DFDLDMTEEE VQMALS KVDAS G 
EVSGPGGSEGSEPNGPGCESSPQPAQLSPQEGPCSCLR 


6503 


213 


1650 


AGNKPDPWAGRNRTAVLPDVSVFHREDVGWWRSWLQQSYQAVKE 
KS S EALE FMKRDLTE PTQWQHDTACT I AATA3 WKE KLATEGS 
SGATEKMKKGLSDPLGVISDTFAPSPDKTIDCDVITLMGTPSGT 
AEPYDGTKARLYSLQSDPATYCNEPDGPPELFDAWLSQFCLEEK 
KGE I SELLVGSPS IRALYTKMVPAAVSHSEFPWR YF YKVHQLEQ 
EQARRDALKQRAEQS I SEEPGWEEEEEELMG I SPISPKEAKVPV 
AKI STFFEGEPGPQSPCEENLVTSVEPPAEVTPSESSES ISLVT 
Q IAN PATAP EAR VLPKDLSQKLLEAS LE EQG LAVDVGETG PS P P 
IHSKPLTPAGHTGGPEPRPPARVETLREEAPTDLRVFELNSDSG 
KSTPSNNG KKGSSTDI SEDWEKDFDLDMTBEE VQMALSKVDASG 
EVSGPGGSEGSEPNGPGCESSPQPAQLSPQEGPCSCLR 


6504 


2131 


1294 


GKVCLVAHWVCLSILSPPPAGMKTPNAQEAEGQQTRAAAGRATG 
SANMTKKKVSOKKORGRPSSOPCRNT Vnrp T QMfiuivwr'nr'tj tty-» 

WKGTVLDQVP1NPSLYLVKYDGIDCVYGLELHRDERVLSLKILS 
DR VAS S H ISDANLANTI IGKAVEHMFEGEHGS KDEWRGMVLAQA 
PIMKAWFYITYEKDPVLYMYQLLDDYKEGDLRIMPESSESPPTE 
REPGGWDGLI GKHVE YTKEDGSKR I GMVIHQVEAKPSVYFI KF 
DDDFHIYVYDIiVKKS 


6505 


2131 


1294 


GKVCLVAHWVCLSILS PPPAGMKTPNAQEAEGQQTRAAAGRATG '" 
S ANMTKKKVS QKXQRGRPSSOPCRNIVGCR I SHGW KEGDEP I TQ 
WKGTVLDQVP INPSLYLVKYDGIDCV YGliELHRDERVLS LKILS 
DRVAS SHI SDANLANTI IGKAVEHM FEGEHGS KDEWRGMVLAQA 
P1MKAWFY ITYEXD PVLYMYQIiLDD YKEGDLRIM PESSES P PTE 
REPGG WDGL IGXHVE YTKEDGSKRIGMVIHQVEAKPS VY F I KF 
DDDFHIYVYDLVKKS 


6506 


1 


1350 


EVSPPTSCCLTVAVADPGVSEGFRGFGAGCEMPGRGRCPDCGST 
EIiVBDSHYSQSQLVCSDCGCVVTEGVIjTTTFSDEGNLREVTYSR 
STGENEQVSRSQQRGLRRVRDIiCRVLQLPPTFEDTAVAYYQQAY 
RHSG IRAARLQKKEVLVGCCVLITCRQHNWPLTMGAI ctllyad 

ldvpsstymqi vkllgldvpslciaelvktycs s fklfqas psv 
pakyvedicekmlsrtmqlvelanetwlvtgrhplpvitaatfla 
wqslqpadrlscslarfcklanvdlpypassrlqellavllrma 
eqlawlrvlrldkrswkhigdllqhrqslvrsafrdgtabvet 

REKEPPGWGQGQGEGBVGNNSLGLPQGKRPASPALIiLPPCWIiKS 

pkricpvppvstvtgdenisdseieqylrtpqevrdfqraqaar 

QAATSVPNPP 


6507 


1878 


929 


RSH^RLPELPSGCLVI^VQELVQMSGMEATVTiPIWQNKPHGA^ 
ARSVVRRIGTNLPLK?CARASFETLPNISD^CIiRDVpPVPTI«AD 
I AWI AADE E ETYAR VRS DTR PLRHTWKP S PLI VMQRNAS VPNLR 

GSEERLIJ^KKPALPAl^RTTEIjQDELSHLRSQIAKIVAADAAS 
ASLTFDFLSPGSSNVSSPLPCFGSSFHSTTSFVISDITBETEVE 
VPELPSVPLLCSASPECC3CPEHKAACSSSEEDDCVSLSKASSFA 
DMMGILKDFHRMKQSQDLNRSLLKEEDPAVLISEVLRRKFALKE 
EDISRKGN 


6508 


862 


342 


WEARKRPQRW^SERREVRVPPPHLQRGRSGLEPGTFRKMAAARP 
S WRVLPGSS VLFLCDMQEKFRHNI AYFPQI VSVAAR^KNTTIi 
DLLDRGLQVHVWDACSSRSQVDRLVALARMRQSGAFLSTSEGL 
IliQLVGDAVHPQFKEIQKLIKEPAPDSGLLGLFCJGQNSliliH 


6509 


2 


1053 


F VWNPRGGR1CRRRQ AAVTQAATRASGTPS PRDGTMTQGKIiSVAN 
KAPGTEGQQQVHGEKKEAPAVPSAPPSYEEATSGEGMKAGAFPP 
APTAVPLHP S WAYVDPSSS SS YDNGFPTGDHBL FTTPS WDDQKV 
RRVFWKVYTILLIQLLVTLAWALFTFCDPVKDYVQANPGWYW 
ASYAVFFATYLTLACCSGPRRHFPWNLILLTVFTLSMAYLTGML 
SS YYNTTS VLLCLGITALVCLS VTVFS FQTKFDFTSCQGVLFVL 
LMTLFFSGL I LA I LLPFQ YVP WLHAVYAALGAG VTTLFLALDTQ 
LLMGNRRHSIiSPEEYI FGALNI YLDI I YIFTFFLQLFGTNRE 


6510 


37 


1156 


PCALDGCPQRGAVH PLLSSAMGLijAFLKTQ FVLHLLVG FVF WS 
GIjVINFVQLCTLALWPVSKQLYRRLNCRLAYSLWSQLVMLLEWW 

sctbctlftdqatverfgkehaviiiinhn'feidflcgwtmcerf 
gvlgsskvlakkellyvpligwtwyfleivfckrkweedrdtvv 
eglrrlsdypeymwfllycegtrftetkhrvsmevaaakglpvl 
kyhllprtkgfttavkclrgtvaavydvtlnfrgnknpslusil 
ygkkyeadmcvrrfpledi pldekeaaqwlhklyqekdalqe I Y 
nqkgmfpgeqfkparrpwtllnflswatillsplfsfvlgvfas 
gsplliltflgfvgagnghcr 


6511 


2541 


1425 


geeqplaaaptecleqviggagdpgtwasfpsplpgpapi,kggk 

TMATNFSDIVKQGYVKMKSRKIiGIYRRCWLVFRKSSSKGPQRLE 
1 KYPD2KS VCLRGCPKVTEISNVKCVTRLPKETKRQAVAI I FTDD 
! SARTFTCDSELEAEEWYKTLSVECLGSRIiNDISLGEPDLLAPGV 
QCEQTDR FNVFLIjPCPNIiDVYGE CKLQI THEUI YLWD I HMPR VK 
L VS WPLCSLRR YGRDATRFTFEAGRMCDAGEGL YTFQTQEGEQ I 
YQRVHSATLAIAEQKKRVLLEMEKNVRLLNKGTEHYSYPCTPTT 
MLPRSAYWHHITGSQNIAEASSYAGEGYGAAQASSETDLLKRFI 
LLKPKPSQGDSSEAKTPSQ 


6512 


159 


807 


FGKKSTWFPI^RSLRVASGRSCKLGHdGYTGSGPGFGEPRJDSCSA 
EVPSGSGRATGCERGGVRGARQGRAPGSS I WRKEPRMVCTRKTK 
TLVSTCVILSGMTNI I CLLYVGWVTNYIASVYVRGQEPAPDKKL 
EEDKGDTLKIIERIjDHLENVIKQHIQBAPAKPEEAEAEPFTDSS 
LFAHWGQEL S PEGRRVALKQFQ YYG YNAYLSDRLPLDRP 


6513 


2 


756 


FV S PEPGFSLAQLNL I WQLTDT KQLVHSFAEGQDQGSAYANRTA 
LFPDLLAQGNASLRLQRVRVADEGSFTCFVS I RD FGS AAVS LiQV 
AAP YS KPSMTLEPNKDLRPGDTVTITCSSYQGYPEAEVFWQDGa 
GVPLTGNVTTSQMANEQGIiFDVHS ILRVVLGANGT YSCLVRNPV 
LQQDAHS S VTITPQRS PTGAVEVQVPEDP WALVGTDATLRCS F 
SPE PG FS LAQLNL I WQLTDTKQLVHS FAEGQDQGS AYANRTAZiF 
PD LLAQGNAS LRLQRVR VADEGS FTCFVS IRD FGS AAVS LQVAA 
PYSKPSMTLEPNKDLRPGDTVTITCSSYQGYPEAEVPWQDGQGV 
PLTGNVTTSQMANEQGLFDVHSILRVVIiGANGTYSCLVRNPVljQ 
QDAHSSVTlTPQRSPTGAVBVftVPBDPWALVGTDATLRCSFSP 
EPGFSIiAQLNLIWQLTDTRQLVHSFTEGR 


6514 


985 


302 


VGIPGPTISSAAEMEDLIiDIOEELR YSIATSRAkfaGRRAQQ^SA ' '" 
QAENHLNGKNSSLTLTGETSSAKLPRCRQGGWAGDSVKASKFRR 
KASEEI EDFRLRPQSLNGSDYGGDI PI I PDLEEVQEEDFVLQVA 
APPS IQ IXRVMT YRDLDNDLMKYSAIQTLDGE1 DLKLLTKVLAP 

EHEVRERNPSWQDDVGWDWDHLFTEVSSEVLTEWDPLQTEKEDP 
AGQARHT 


6515 


1345 


305 


GRVGSRRRGAAVPGGCGAGSTQLEVSAS ASBGALCy^ADMNP IW" 

riiuwwur J.O(U/AAJCiKvinUwl1VlUUii Via X vXJjIUiUUbAVOAVEG 

AWALEDD PEFNAGCX3SVLNTNGEVEMDAS IMDGKDLSAGAVS A 
VQCIANPIKLARLVMEaCTPHCPLTDQGAAQFAAAMGVPEIPGEK 
LVTERNKKRLEKEKHSKGAQKTDCQKNliGTVGAVALDCKGNVAY 
AT5TGG I VNKMVGRVGDS PCLGAGGYADNDIGAVSTTGHGES I L 
KVNLARLTLFHIEQGXT VE EAADLSLGYMKSRVKGIjGGLI WS K 
TGDWVAKWTSTSMPWAAAKDGKLHFGIDPDDTTITDLP 


6516 


1 


1402 


frrlrylgqdataaardlrtrglqgycpsatarqqvlVsalqql 

KGRRSBHRNENQEMPYSTNKEIiILGIMVGTAGISLLLLWYHKVR 

kpgi amklpeflsiigntfns itlqdeihddqgttvi fqerqlq i 
leklnelltnmeelkeeirflkeaipkleeyiqdelggkitvhk 
ispqhrarkrriiptiqssatsnsseeaeseggyitantdteeqs 
fpvpkafntrveelnldvllqkvdhlrmsesgksespeLLrdhk 

E KFRDE I E FMWRFARAYGDMYELSTNTQBKKHYAN IGKTLS ERA 
INRAPMNGHCHLWYAVLCGYVSEFEGLQNKINYGHLFKKHLDIA 
IKLLPEEPFLyYLKGRyCYTVSKLSWlEKKMAATLFGKIFSSTV 
QEALHNFliKAEELCPGySNPNVMYIAKCYTDLEENQNAIiKFCNIi 
ALLLPTVTKEDKEAQKEMQK IMTSLKR 


6517 


3 


1414 


GRVWGGS S S IjNAM VYVRGHAEDYERWQRQGARGWD YAHCLP YFR 
KAQGH ELGAS R Y RG ADGP LR VS RGKTNH P LH CA FLEATQ QAG YP 
LTEDMNGFQQEG FGWMDMT I H EGKRWS AACAYLHPALS RTNLKA 
EAErLVSRVLFEGTRAVGVEYVKNGQSHRAYASKEVILSGGAIN 
S PQLLMLSG IGNADDLKKLG I P WCHLPGVGQNLQDHLEI YI QQ 
ACTRPITLHSAQKPLRKVCIGLEWLWKFTGEGATAHLBTGGFIR 
S QPG VPHPDI Q FHFLPS QVI DHGRVPTQQEAYQVHVGPMRGTS V 
GWLKLRSANPQDHPVIQPNYLSTETDIEDFRLCVKIiTREIFAQE 
ALAPTOGKELQPGSHIQSDKEIDAFVRAKADSAYHPSCTCKMGQ 
PSDPTAVVDPO/TRVLGVENLRVVDASIMPSMVSGNLNAPTIMIA 
BKAADI I KGQPALWDKDVPVYKPRTLATQR 


6518 


242 


1098 


PAWNPGSBPRTRVRPRARSFPtpppRAPRRRRHRLLRAVPGPSR 
RHRCRRRAPPPPSTMGDAGSBRSKAPSLPPRCPCGFWG5SKTKN 
hCS KCFADFQKKQPDDDS APSTSNSQSDLFSEETTSDNNNTS IT 
TPTLSPSQQPLPTELNVTSPSKEECGPCTDTAHVSLITPTKRSC 
GTDS QSENEAS P VKRPRLLENTERS EETS RS KQ KSRRR C FQ CQT 
KIiELVQQELGSCRCGYVFCMLHRLPEQHDCTFDHMGRGREEAIM 
KM VKLDRKVGRS CQR I GEGCS 


6519 


3 


1113 


BRKMAKPPSPVHCVAAAAPTATVSEKEPFGKLQLSSRDPPGSLS 
AXKVRTEEKKAPRRVNGEGGSGGNSRQLQPPAA PS PQS YGS PAS 
WSFAPLSAAPSPSSSRSSFSFSAGTAVPSSASASLSQPGPRKLL 
VPPTLLHAQPHHLIiLPAAAAAASANAKSRRPKEKREKERRRHGL 
GGAREAGGAS REENGEVKPLPRDKI KDK I KBRDKE KER K KK KHK 
VMNEIKXENGEVKILLKSGKEKPKTNIEDLQiraCVKKKKIQCKHK 
ENEKRKR P KM YS KS IQTI CSGLLTDVEDQAAKG I LNDN I XD YVG 
KNLDTKN YDS KI PENSE FPFVSLKEPRVQNNLKRLDTLEFKQLI 
KIEHGPNGGASVIHCLQ 


6520 


3 


1113 


ERKMAEPPSPVHCVAAAAPrATVSEKEPFGKLQLSSRDPPGSLS 
AKKVRTEEKKAPRRVNGBGGSGGNSRQLQPPAAPSPQSYGSPAS 
HSFAPLSAAPSPSSSRSSFSFSAGTAVPSSASASLSQPGPRKLIi 
VPPTLLHAQPHHLLLPAAAAAASANAKSRRPKBKRE KERRRHGL 
GGAREAGGASREENGEVKPLPRDKI KDKIKERDKEKEREKKKHK 
VMNEIKKKNGEVKILLKSGKEKPKTNIEDLQI^CKVXKKKKKKHK 
ENEKRKRPKMYSKSIQTICSGIitiTDVEDOAAKGIIJBIDNIKDYVG 
KNLDTKNYDSKI PENSEFP FVSLKEPRVQNNLKRLDTLEFKQLI 
HIEHQPNGGASVIHCLQ 


6521 


184 


1798 


Kli^ATDTSC^ELVHPKAIiPLrVGAQLIHADKLGEKVSDSTMP 
I RRTVNS TRETP PKS KLABGE EE KPBPD I SS EES VSTVEEQENE 
i e trn ioo ctf\a\ieis\jxu tr aSiatt Iv±i£Nivb S EETaKDEKDQS KEKEKK 
VKKTIPSWATLSASQLARAQKQTPMASSPRPKMDAILTEAIXAC 
PQKSGASWAIRKY 1 1 HKYPSLELERRGYLLKQALKRELNRGVI 
KQVKGKGASGSFVWQKSRKTPQKSRNRKNRSSAVDPEPQVKLE 
DVLPLAFTRLCEPKEASYSLIRKYVSQYYPKLRVDIRPQLLKKA 
LQRAVERGQLEQITGKGASGTFOIJCKSGEKPLLGGSLMEYAILS 
AIAAMNBPKTCSTTALKKYVLENHPGTNSNYQMHLLKKTLQKCB 
KNGWMEQISGKGFSGTFQLCFPYYPSPGVLFPKKEPDDSRDEDB 
DEDESSEEDSEDEEPPPKRRLQICKTPAKSPGKAASVKQRGSKPA 
PKVSAAQRGKARPLPKKAPPKAKTPAKKTRPSSTVIKKPSGGSS 
KKPATSARKE 


6522 


1042 


391 


NKWI^PSPRSHRTPESGRVLSLFRLPPPGMALSGSTPAPCWEED 
ECLDYYGMLSLHRMFEVVGGQLTECELELLAFIjLDEAPGAAGGL " 
aXiUColjXjRUljUaJLjaRRGQCDESNLRL 

RKRRRPVSPERYSYGTSSSSKRTEGSCRRRRQSSSSANSQQGSP 
PTKRQRRSRGRFSGGARRRRRGPQPHPSSSQSPPDLPLKAK 


6523 


2 


1697 


ASCQTRRRTAALDSGERIAGRRSPIALAMASNFNDIVKQGYVKI " 
RSRKLGI FRRCWLVFKKASS KGPRRLBKFPDEKAAYFRNFHKVT 
ELHNIKNITRLPRETKKHAVAI I FHDBTSKTFACESELEAEEWC 
KHLCMECLGTR LND I S LGE PDLLAAGVQREQNERPNVYLMPTPN 
LD I YGECTMQ I THEN I YLWD IHNAKVKLVMWP LSSLRRYGRDST 
WFTFESGRMCDTGEGLFTFQTREGEMIYQKVHSATLAIAEQHER 
IiMLEMEQKARLQTSLTEPMTLSKSISLPRSAYWHHITRQNSVGE 
IYSLQGNHENRHSDLTGKSCKTSENRFLEENAPLVMYGITHHLF 
MDTSTCKWHDLE 


6524 


2 


1097 


ASCQTRRRTAALDSGERIAGRRS P IALAMASNFNDIVKQGYVKI 
RS R KLG I FRRCWLVFKKASSKGP RRLEKF P DEKAAYFRNFHKVT 
ELHNIKNITRLPRETKKHAVAI I FHDBTSKTFACESELEAEEWC 
KHLCMECLGTRLNDISLGEFDLLAAGVQREQNERFNVYLMPTPN 
LD I YGECTMQI THEN I YLWD I HNAKVXLVMWPLSSLRR YGRDST 
WFTFESGRMCDTGEGLFTFQTREGEMIYQKVHSATLAIAEQHER 
LMLEMEQKARLQTS LTE PMTLS KS I S LPRS A Y WHH I TRQNS VGE 
IYSIiQGNHENRHSDLTGKSCKTSEN^PLEENAPLVMYGITHHLP 
MDTSTCKWHDLE 


6525 


1 


1859 


GE S PF5EE E S I EFNPS S SGRSART VS SNS FCSDDTGWPS SQSVS " 
PVKTPSDAGNSP1GFCPGSDEGFTRKKCTIGMVGEGSIQSSRYK 
KESKSGLVKPGSEADFSSSSSTGSISAPEVHMSTAGSKRSSSSR 
NRGPHGRSNGASSHKPGSSPSSPREKDLLSMLCRNQLSPVNIHP 
SYAPSSPSSSNSGSYKGSDCSPIMRRSGRYMSCGENHGVRPFNP 
EQYLTPLCXJKEVTVRHLKTKLKESERRIiHERESEIVELKSQLAR 
MREDWIEEECHRVEAQLALKEARKEIKQLKQVIETMRSSLADKD 
KG I QKYFVD INIQNKKLES LLQSMEMAHSGSLRDELCLDFPCDS 
PEKSLTLNPPLDTMADGLSLEEQVTGEGADRELLVGDSIANSTD 
LFDE I VTATTTESGDLELVHSTPGANVLELLPI VMGQEEGSVW 
ERAVQTDWP YS PAI SE L I QS VLQKLQDPCP S S LAS PDES EPDS 
MES F PESLSAL WDLTPRNPNSAI LLS P VETP YANVDAE VHANR 
LMRE LD FAACVE ERLDG V I P LARGG WRQY W S S S FLVDLLAVAA 
P WPTVLWAFS TQRGGTDP VYNIGALLRGCCVVALHSLRRTAFR 
IKT ; 


6526 


2 


2034 


SGRAGEPEEWRGRQI IDS KETWI P FNS EDSQQLE EAYSSGKGCN 
GRWPTDGGRYDVHLGERMRYAVYWDELASEVRRCTWFYKGDKD 
NKYVP YS ES FS Q VLEETYMLAVTLDE WKKKLES PNRE 1 1 1 LHNP 
KLMVHYQ P VAGS DDWGST PMEQGR PRT VKRGVENI S VDI HCGEP 
LAj 1 utiuv r Wxio-LLri'ALDLiKcRS I VQ CVNDFRS VS LNLLQTHFK 
KAQENQQIGRVE FLPVN WHS PLHS TGVDVDLQRITLPS I NRLRH 
FTNDTILDVFFYNSPTYCQTI VDTVAS EMNRI YTLFLQRNPDFK 
GGVSIAGHSLGSLILFDILTNQKDSLGDIDSEKGSLNIVMDQGD 
TPTLEEDLKKLQLSEFFDIFEKEKVDKEALALCTDRDLQEIGIP 
LGPRKKILNYFSTRKNSMGIKRPAPQPASGANIPKESEFCSSSN 
TRNGDYLDVGIGQVSVKYPRLI YKPEI FFAFGSP IGMFLTVRGL 
KRIDPNYRPPTCKGFFNIYHPFDPVAYRIEPMWPGVEFEPMLI 
PHHKGRKRMHLELREGLTRMSMDLKNNLLGSLRMAWKSFTRAPY 
PALQASETPEETEAEPESTS EKPSDVNTEETSVAVKEEVLP INV 
GMLNGGQRIDYVLQEKPIESFNEYLFALQSHLCYWESEDTVLLV 
LKEIYQTQGIFLDQPLQ 


6527 


1 


922 


GWVPLLSRILPSDACKIYKQGINIRLDTTLIDFTDMkCQRGDLS 
FI FNGDAAPSES FVVLDNEQKVYQRIHHEESEMETEEE VDILMS 
SDI YSATLSTKS I S FTRAQTG WLFREDKTERVGNFLADFYLVNG 
lvlesrkrrehlseedilrnkaimeslskggnimeqnfepirrq"™ 

SLTPPPQNTITWEEYISAENGKAPHLGRELVCKESKKTFKATIA 
MSQEFPLGIELLLNVLBVVAPFKHFNKIiREFVQMKLPPGFPViCL 
D I PVF PTITATVTFQE FRYDS FDGS I FTIPDD YKEDPSRFPDL 


6528 


1 


1073 


LTGPAAAEPRCAADAGMKRALGRRKGVWLRLRKI LFCVLGLV lA 
I P FL I KLCPG I QAKLI FLNFWVP Y FIDWCKPQDQGLNHTCNY Y 
LQPEED\rriGWHTVPAVWM;<NAQGia)QMWYEDALASSHPI ILY 
LHGNAGTRGGDHRVELYKVLSSLGYHWTFDYRGWGDSVGTPSE 
RGMTYDALHVFDWIKARSGDNPVYIWGHSLGTGVATNLVRRLCE 
RETPPDALILESPFTNIRBEAKSHPFSVIYRYFPGFDWFFLDPI 
TSSGIKFANDENVKHISCPLLILHAEDDPWPFQLGRKLYS IAA 
PARSFRDFKVQFVPFHSDLGYRHKYIYKSPELPR ILREFLGKSE 
PEHQH 


6529 


363 1 ' 


2215 


THIRYNKIGWKTMSCGNEFVETLKKIGYPKADNLNGEDFDWLF ' * 
EGVEDESFLKWFCGNVNEQNVLSERELEAFSILQKSGKPILEGA 

ZXT.FlT? AT-VTr 1 VTQDT'. VTDOT FlFlVET DVT CinDtfATT T VT mrr vr/\ts 

niJuat\Ltis.i \,jvioi/ um ^KijiJUAiiijexUjrjUrSvy IIjJjjKJjIunJjKIQR 
RNKCQLMASVTSHKSLRLNAKEEEATKKLKQSQGILNAMITKIS 
NELQAIiTDEVTQLMMFFRHSNLGQGTNPLVFLSQFSLEKYLSQE 
EQSTAALTLYTKKQFFQGIHEWESSNESQFFNFIjKIQTPSICD 

nqei leerrlemarlqlay icaqhqlihlkasnssmpcss ikwae 
eslhsltskavdkenldakissltseimklekevtqikdrslpa 
wrenaqllkmpwkgdfdlqiakqdyytarqelvlnqlikqka 

SFELLQLSYEIELRKHRDIYRQLENLVQELSQSNMMLYKQLEML 
TDPSVSQQINPRNT I DTKDYS THRL YQ VLEGEN KKKE LFLTHGN 
LEE VAE KLKQNI SLVQDQLAVS AQEHS FFLS KRNKD VDMLCDTL 
YQGGNQLLLSI^ELTEQFKKVESQIJ^KLNHLLTDILADVKTKRK 
TLANNKLHQMERE FYVYFLKDEDYLKD I VENLETQS KI KAVS LE 
D 


6530 


128 


2986 


G AAHHG AI VQ VHPLLPGS ST I M I HDLCLVFPAPAKAWYVSD I Q "~ 
ELY I RVVDKVB IGKTVKAYVR VLDLHKKPFLAKYFP FMDLKLRA 
AS PI ITLVALDEALDNYTITFLI RGVAIGQTSLTAS VTNKAGQR 
I NS APQQ I EVFPP FRLMPRKVTLLIGATMQVTSEGG PQPQSNI L 
FS I SNES VALVS AAGL VQGIAI GNGTVS GLVQAVD AETGKWI I 
SQDLVQVEVLLLRAVRIRAP IMRMRTGTQMPIYVTGI TNHQNPF 
S FGNAVPGLTFHWS VTKRDVLDLRGRHHEAS IRLPSQYNFAMNV 
LGRVKGRTGLRAWKAVDPTSGQLYGLARELSDE1 QVQVFEKLQ 
LLNPE I EAEQILMS PNS YI KLQTNRDGAASLS YRVLDG PEKVP V 
VHVDEKG FLASGSM IGTSTI EVT AQEPFGAKTQTI I VAVKVS PVS 
YLRVSMS P VLHTQNKEALVAVPLGMTVTFTVHFHDNS GDVFHAH 
S S VLNFATNRDD FVQ IGKGPTNNTCWRTVSVGLTLLRVWDAKH 
PGLSDFMPLPVLQAISPELSGAMWGDVLCLATVLTSLEGLSGT 
WSSSANS ILHIDPKTGVAVARAVGS VTVYYEVAGHLRTYKE VW 
S VPQRI MARHLHP IQTSFQBATASKVIVAVGDRSSNLRGECTPT 
QREVI QALHPETL I S OQSQFKPAVFD PPSQDVFTVEPQ FDTALG 
QYFCS ITMKRLTDKQRKHLS MKKTALWSAS LSSSHFSTEQVGA 
EVPFSPGLFADQAEILLSNHYTSSEIRVFGAPEVLENLEVKSG3 
PAVLA FAKEKSFGWP3 F ITYTVGVLDPAAGSQGPLSTTLXFS 3 P 
VTNQAI AI PVTVAFWDRRGPG P YGASLFQHFLDS YQVMFFTL F 
ALLAGTAVMI IAYHTVCTPRDLAVPAALTPRASPGHS PHYFAAS 
SPTSPNALPPARKASPPSGLWSPAYASH 


6531 


845 


1425 


PSASIPPSASPDPVPDIRTCHFCLVEDPSVGCISGSEKCTISSS 
SLCMVITIYYDVKVRRIVRGCGQYISYRCQEKRNTYFAEYWYQA 
QCCQYDYCNSWSSPQLQSSLPEPHDRPLALPLSDSQIQWFYQAL 
NLS LPLPNFHAGTEPDGLDPMVTLS LNLGLS FAELRRM YLFLNS 
SGLLVLP QAGLLTPHPS 


6532 


2 


954 


AAGPPSEWNQDSLFPE^PGPAPQVttGPQGPGLlkGVAPPTL" 
ITDSTGTHIiVLTVTN KNAH8 PGLSRGS PQQP S SQ PGSPAPAPSA 
QMDLEHPIjQPLFGTPTSLLKKBPPGYBEAMSQQPKQQENGSSSQ 
QMDDLFDIL1QSGEISADFKEPPSLPGKEKPSPKTVCWSPLAAQ 
PSPSAELPQAAPPPPGSPSLPGRLBDFLESSTGLPLLTSGHDGP 
EPLSLIDDLHSQMLSSTAIliDHPPSPMDTSELHPVPEPSSTMGL 
DLADGHLDSMDWLELSSGGPVLSLAPLSTTAPSLPSTDFLDGHD 
LQLHWDSCL 


6533 


1798 


373 


STISWLARVEPPRRSSGVGAARUiFPGGSRPIJUVIUCVIiAliAVli 
ALLERNNADSMS AHSMLCER T AIAKEL I KRAESLS RS RKGG I EG 
GAKLCSKLKAELKFLQKVEAGKVAIKESHLQSTNLTHLRAIVES 
AENLEEWSVLHVFGYTDTLGEKQTLVVDVVANGGHTWVKAIGR 
KAEALHNI WLGR G Q YGDKS £ IEQAEDFIiQASHQQPVQ YSNPH 1 1 
FAFYNSVSSPMAEKLKEMG1SVRGDIVAVNALLDHPEELQPSES 
ESDDEGPELLOVTRVDRENI LASVAF PTBI KVDVCKR VNLDI TT 
LITyVSALSYGGCHFIFKEKVLTEQAEQERKEQVLPQLEAFMKD 
KELFACESAVKDFQSILDTLGGPGERERATVLIKRINVVPDQPS 
ERALRLVASS KINS RSLTI FGTGDTLKAITMTANSGFVRAANNQ 
GVKFSVFIHQPRAI.TESKEAIATPIiPKDYTTDSEH 


6534 


47 


596 


KATRFISAAFVVtNkQGVsPAkLPHTSWSWSLQTLSFLFSGDLA 
EKSLQCFPCSAMLLELIPLLGIHFVLRTARAQSVTQPDIHITVS 
EGASLELRCNYSYGATPYIiFWMERTVEEAFILLVCLKPWRVASS 
IiEKKEKEDESFQLLLGSRYNVLKAHCIiLPLIRWLTSGDSIiLSAQ 
PHCPQGL 


6535 


250 


964 


LHCrFFRDVAIQRDL,LPKBKNLETLLTLAFbEIDKAFS^]tiARLS 
ADATLLTSGTTATVALLRDG I ELVVASVGDSRAILCRKGKPMKL 
TIDHTPERKDEKERI KKCGGFVAWNSLGQPHVNGRLAMTRS IGD 
LDLKTSGVI AEPETKRIKLHHADOSFLVLTTDG INFM VNSQEI W 
DFVNQCHDPNEAAHAVTEQAIQYGTEDNSTAVVVPFGAWGKYKN 
SEINFS FSRS FASSGR WA 


6536 


242 


1174 


S LVKEMTNQYG I LFKQEQAHDDAI WS VAWGTNKKENS ET VVTGS 
LDDLVKVWKWRDERLDLQWSLEGHQLGWSVDISHTLPIAASSS 
LDAH IRLWDLENGKQ I KS I DAGPVDAWTLAFSPDSQ YLATGTHV 
GKVNI FGVE SGKKE YS LDTRGKFILS I AYS PDGKYLA SG A I DG I 
INIFDIATGKLLHTLEGHAMPIRSLTFSPDSQIJiVTASDDGYIK 
IYT}VQHANIAGTLSGHASWVLNVAFCPDDTHFVSSSSDKSVKVW 
DVGTRTCVHTFFDHQDQVWGVKYNGNGSKIVSVGDDQEIHIYDC 
PI 


6537 


1638 


921 


NRFNPPPTQGPDPSLVYRPDVDPEVAKDKASFRNYTSGPLtDRV 
FTTYKIj^THQTVDFVRSKHAQFGGFSYKKMTVMEAVDLLDGLV 
DESDPDVDFPNS FHAFQTAEG I RKAHPDKDWFHLVGLLHDLGKV 
JjrfUji?\jii^ywAVVt3lJ^rirPVGCRPQASVVFCDSTFy 
STELGMYQPHCGLDRVLMSWGHDGEARGGQWGGGGRWGTVGGGG 
AEAVP AGDTLS PQSTCTR 


6538 


3345 


2412 


PYLYDFLDALITCQTAPEEAFIKIjDGLAGMIjTEQIjRRLTKQVQE 
ARHNRDDE A I KKAVNE YDETMEK YX P VLMAQAKI YWNLENYPMV 
EKIFRKSVEFCOTDHDVWKLNVAHVLFMQBNKYKEAIGFYEPIVK 
KH YDMILNVS AI VLANLCVS YIMTSQNEKAEELMRKI EKEEEQL 
S YDDPNRKM YHLC I VNLVTGTIi YCAKGNYE FGIS RVI KS LEPYN 
KKLGTDTWYYAKRCFLSIiLENMSKHMIVIHDSVrQECVQFLGHC 
ELYGTNI PAVI EQPLEEERMHVGICNTVTDESRQLKAL I YE I IGW 
NK 


6539 


218 


339 


FLGAASPHPHFSSLAPHPDQPEFTPVQDELEAMELWGPGV 


6540 


3 


391 


I^RLWLLLLRRPEDAMAECPTLGEAVTDHPDRLWAWEKFVYLDE 
KQHAWIiPLT I E I KDRLQLRVLLRREDVVIiGRPMTPTQ IGPSLLP 
IMWQLYPDGRYRSSDSSFWRLVYHXKIDGVEDMLLELLPDD


SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
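
The legend above defines the symbols used in the amino acid segment column. As an illustration only (this helper is not part of the listing, and the names `LEGEND`, `decode_segment` and `count_residues` are hypothetical), the following Python sketch expands a segment into the legend's names and reports any character the legend does not define, which may help flag transcription errors in a row.

```python
# Hypothetical helper for reading the table rows; not part of the listing itself.
# The mapping below is copied from the column legend.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown", "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def decode_segment(segment):
    """Return (decoded, unrecognized).

    decoded      -- legend names for every symbol the legend defines, in order
    unrecognized -- characters the legend does not define (whitespace is skipped)
    """
    decoded, unrecognized = [], []
    for ch in segment:
        if ch.isspace():
            continue
        name = LEGEND.get(ch)
        if name is None:
            unrecognized.append(ch)
        else:
            decoded.append(name)
    return decoded, unrecognized


def count_residues(segment):
    """Count symbols that stand for residues (the 20 letters plus X)."""
    return sum(1 for ch in segment if ch.isalpha() and ch in LEGEND)
```

For example, `decode_segment("MA*")` returns the names for methionine, alanine and a stop codon with nothing flagged, whereas a stray digit, or a letter such as B that the legend does not list, is returned in the second list.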


6541 


1165 


536 


RTLVQRR I LMLLRKPARGRDIiRGRGRGTPRGGRKGLLPTPDBFP 
RFEGGRKPDSWDGNREPGPGHEHFRDTPRPDHPPHDGHSPASRE 
RSSSLQGMDMASLPPRKRPWHDGPGTSBHREMEAPGGPSEDRGG 
KGRGGPGPAQRVPKSGRSSSIiDGEHHDGYttRDEPFGGPPGSGTP 
SRGGRSGSNWGRGSNMNSGPPRRGASRGGGRGR 


6542 


3 


3775 


SWPRGRGETGGHPGALRTRTMQKSVRyNEGHALYtiAFIARKEGT 
KRGFLSKKTABASRWHEKWFALYQNVLFYFEGEQSCRPAGMYLL 
EG CS CERT PAPPRAGAGQGGVRDALDKQ YY FTVLFGHEGQKPLE 
LRCEEEQDGKEWMEAIHQASYADILIEREVLMQKYIHLVQIVET 
EKIAANQLRHQLEDQDTEIERLKSEIIALNKTKERMRPYQSNQE 
DEDPDI KK I KKVQSFMRGWLCRRKWKTI VQD YI CSPHAESMRKR 
NQIVPTMVEAESEYVHQLYILVNGFLRPLRMAASSKKPPISHDD 
VSSIFLNSETIMFLHEIFHQGLKARIANWPTIjUiADLFDILLPM 
LNIYQEFVRNHQYSLQVXANCKQNRDFDKLLKQYEANPACEGRM 
LETFLTYPMFQIPRYirTLHELIAHTPHEHVERKSl^FAKSKLE 
ELSRVMHDEVSDTENIRKNLAIERMIVEGCDILLDTSQTFIRQG 
S LI Q VPS VE RGKLS KVRLGSLS L KKEGERQC FLFTKH FLI CTRS 
SGGKUCLLKTGGVIiSLIDCTLIEEPnASDDDSKGSGQVFGHLDF 
KIWEPPDRAAFTVVLLAPSRQEKAAWMSDISQCVDNIRCNGLM 
TIVFEENSKVTVPHMIKSDARLHKDDTDICFSKTLNSCKVPQIR 
YAS VERLLERLTDLRFLS IDFLNTFLHTYRI FTTAAVVLGKLSD 
I YKR PFTS I P VRSLELFFATSQNNRGEHLVDGKS PRLCRKFSS P 
PPLAVSRTSS PVRARKI*5LT5PLNSKIGALDLTTSSS PTTTTQS 
PAASPPPHTGQI PLDLSRGLSS PEQS PGTVEENVDNPRVDLCNK 
LKRS I QKAVLB5APADRAGVESS PAADTTELS P CRS PST PRHLR 
YRQPGGQTADNAHCS VS PASAFAIATAAAGHGS PPGFNNTERTC 
DKEFI IRRTATKRVLNV1JIHWVSKHAQDFELNNELKMNVLNLLE 
EVLRDPDLLPQERKAAANILMALSQDDQDDIHLKIiEDI IQMTDC 
MKAE CFESIiS AMELAEQ I TLLDHVI FRS I PYEE FLGQGWMKLDK 
NERTP Y IM KTSQHFNDMSNLVASQ IMNYADVS SRANAIEKWVAV 
ADI CRCLHNYNGVIiE ITSALNRSAI YRLKKTWAKVS KQTKALMD 
KLQKTVSSEGRFKNLRETIiKNCNPPAVPYLGMYLTDLAFIEEGT 
PNFTE EGLVNFS KMRMI S H 1 1 RE I RQ FQQTS YR I DHQ PKVAQ Y L 
LDKDLI IDEDTLYELSLKI EPRLPA 


6543 


1857 


950 


FVSGCGRAG IGLS WAMAAEAR VSRWYFGGLAS CGAACCTH PXjDL 
LKVtttiQTQQEVKLRMTGMAIjRVVRTDG I LALYSGLSAS LCRQMT 
YSLTRFAIYETVRDRVAKGSQGPLPFHEKVIjLGSVSGIiAGGFVG 
TPADLVNVRWQNDVKLPQGQRRNYAHALDGLYRVAREEGLRRLF 
SGATMASSRGALVTVGQLS CYDQAKQLVLSTGYLSDNIFTHFVA 
S F 1 AGGCATFLCQPLDVLKTRLMNS KGE YQGVFKCAVETAKLGP 
LAFYKGLVPAG IRLIPHTVLTFVFLEQLRKNFGI KVPS 


6544 


630 


79 


PSPCF IRSRLtXSQPWMAGLEAWLSQNFSLHQPQSRVRVRRAS I S 
EPSDTDPEPRTLN PS PAGWFVQQHPELELMS SFRERFGRNWLQ Y 
RSHLEPSGNPLPATPTTSAPSAPPASSQGPDTAPRPSPPQBEAR 

EGKQKECP 


6545 


176 


560 


PPHSHAALLPAAMTPLLTLILWLMGLPLAQALDCHVCAYNGDN 
CFNPMRCPAMVAYCMTTRTYYTPTRMKVSKSCVPRCFETVYDGY 
S KHAS TTS COQ YDLCWGTGIATPATIiALAP IL LATL WGLL 


6546 


1657 


364


HLLNGLDE VAAFF VADLGAI VRKHFC FLKCbPRVRP F YAVKCNS 
SPGVLKVLAQLGLGFSCANKAEMEIiVQHIG IPASKI I CANPCKQ 
IAQIKYAAKHGIQLLSFDNEMELAKVVKSHPSAKMVLCIATDDS 
HS LS CLSLKFGVS LKS CRHLLENAKKHKVEWGVSFKIGSGCPD 
PQAYAQS I ADAR LVFEMGTELGHKMHVLDLGGGF PG TEGAKVR F 
EE I AS VINS ALDL YFP EGOGVD I FABLGR Y YVTSAFTVAVS 1 1 A 
KKEVLLDQPGREEENGSTSKTiVYHLDEGVYGIFNSVLFDNICP
TPILQKKPSTEQPLYSSSLWGPAVDGCDCVASGLWIiPQLm/GDW 
LVFDNMG AYTVGMGS PFWGTQACH IT YAMSRVAWEALRRQLMAA 
EQBDDVEGVCKPLSCGWEITDTLCVGPVFTPASIM 


6547 


1 


541 


LHSKYLAPALCSQPGMMRCCRRRCCCRQPPHALRPLLLLPLVLL 
PPLAAAAAGPNRCDTIYQGFAECLIRIGDSMGRGGELETICRSW 
KDFHACASQVLSGCPEEAAAW7ES LQQEARQAPRPNNLHTLCGA 
PVHVRERGTGS ETNQBTLRATAPALPMAPAPPLLAAALALAYLL 
RPLA 


6548 


2 


219 


FVSRLSVRDVRFPTFLGGHGADAMHTDPDYSAAYVPIETDAEDG
IKGCGITFTLGKGTEVGELKILSRFQNA


6549


73 


1490 


ETGRVCEDARPACGSRSRRRRKEAAPGIPTPSPSSSSPTSSRPA
ARAFSKAPARLSRPRAREEPPDPGRRYIQBEIIQARKHKLIKMC
SSVAAKLWFLTDRRIREDYPQKEILRALKAKCCEEELDFRAWM
DEVVLTIEQGNLGLRINGELITAYPQVVVVRVPTPWVQSDSDIT
VIJiHLEKMGCRLMNRPQAILNCVNKFWTFQELAGHGVPLPDTFS
YGGH2NFAKMIDEAEVLEFPNfVVKNTRGHRGKAVFlARDKHHLA
DLSHLIRHEAPYLFQKYVKESHGRDVRVIWGGRWGTMLRCST
DGRMQSNCSLGGVGMMCSLSEQGXQLAIQVSNILGMDVCGIDLL
MKDDGSFCVCEANANVGFIAFDKACNLDVAGIIADYAASLLPSG
RLTRRMSLLSWSTASETSEPELGPPASTAVDNMSASSSSVDSD 
PESTERELLTKLPGGLFNMNQLLANEIKLLVD 


6550 


2293 


922 


FRVSRIX^PDCXSIEQMGLAMEHGGSYARAGGSSR^CWYYLRYF^F 
LFVS L IQFL 1 1 LGLVLFMVYGNVH VS TESNLQATERRAEGLYSQ 
LLGLTASQSNLTKELNFTTRAKDAIMQMWLNARRDLDRINASFR 
QCQGDRVI YTNNQRYMAAI ILSBKQCRDQFKDMNKSCDALLFML 
NQKVKTLE VE I AKEKTI CTKDKBS VLLNKRVAEEQLVECVKTRE 
LQHQERQLAKEQLQKVQALCLPLDKDKFEMDLRNLWRDS I I PRS 
L DNLG YNL YH P LGSELAS I RRACDHMPS LMS S KVEE LARS LRAD 
IER VARENSDLQRQKLEAQQGLRAS QEAKQKVE KEAQAREAKLQ 
AECSRQTQLALEEKAVLRKERDNLAKELEEKKREAEQLRMELAI 
RNSALDTCIKTKSQPMMPVSRPMGPVPNPQPIDPASLEEFKRKI 
LESQRPPAGIPVAPSSG 


6551 


157 


748 


IQPPDPRNMTLAAYKEKMKELPLVSLFCSCFLADPLNKSSYKYE
ADTVDLNWCVISDMEVIELNKCTSGQSFEVILKPPSFDGVPEFN
ASLPRRRDPSLEEIQKKLEAAEERRKYQEAELLKHLAEKREHER
EVIQKAIEENNNFIKMAKEKLAQKMESNKENRE
EKDKHAEEVRKNKELKEEASR


6552 


157 


748 


IQPPDPRNMTLAAYKEKMKELPLVSLFCSCFLADPLNKSSYKYE
ADTVDLNWCVISDMEVIELNKCTSGQSFEVILKPPSFDGVPEFN
ASLPRRRDPSLEEIQKKLEAAEERRKYQEAELLKHLAEKREHER
EVIQKAIEENNNFIKMAKEKLAQKMESNKENREAHLAAMLERLQ
EKDKHAEEVRKNKELKEEASR


6553


2 


1807 


FVWSKMAAHLSYGRVNLNVLREAVRREliREFI^kCAG^kAIVWD

eyltgpfgliaqysllkehevekmftlkgnrlpaadvkni I FFV 

*.*r*xujaui lui i/uw v uorjJKXXjif i KUr rlllir V PKKa LLiCEQRLKD 

lgvlgsfihreeysldlipfdgdllsmesegafkecylegdqts 
lyhaakglmtlqalygtipqi fgkgecarqvanmmirmkreftg 
sqnsifpvfdnlllldrnvdlltplatqltyeglideiygiqns 
yvklppekfapkkqgdggkdlpteakklqlnsaeblyaeirdkn 
fnavgs vlskkaki isaafeerhnaktvge i kqfvs qlphmqaa 
rgslanhtsiaelikdvttsedffdkltveqefmsgidtdkvnw 
yiedciaqkhslikvl^lvclqsvcnsglkqkvldyykreilqt 
ygyeh i ltlhnle kagllkpqtggrnnyptrrktlrlwmdd vne 
qnptd i s yvysgyapls vrlaqlls rpgwrs i eevlr i lpgphf 
eerqplptglqkkrqpgenrvtlifflggvtfaeiaalrflsql 
edggteyviattklmngtswi ealmekpf i


6554 


119 


1244 


FEMGSQVSVESGALhl/VIVGGGFGGIAAASQLQALNVPFMLVDM 
KDSFHHNVAALRASVETGFAKKTFISYSVTFKDNFRQGLVVGID 
LKNQMVLLG^EALPFSHLILATGSTGPFPGKFNEVSSQQAAIQ 
AYEDMVRQVQRSRF I VVVGGGSAGVEMAAE 1 KTEYPBKEVTLIH 
SQVAIJU1KELLPSVRQEVKEILLRKGVQLLLSERVSNLEELPLN 
a zKlSx 1-K.vg i DKGTE VATNLVILCTG I KINSS AYRKAFES RLAS 
SGALRVNEIILQVEGHSNVYAIGDCADVRTPKMAYLAGLHANIAV 
AN I VNSVKQR PLOAYKPGALTFLLSMGRNDGVGQISGFYVGRIiM 
VRLTKSRDLFVSTSWKTMRQSPP 


6555 


1552 


498 


IHMALLRKINQVLL^iLIVTI^VliVKKVtiKGTVPKNnADDESE 
TPEELEEEIPWICAAAGRMGATMAAINSIYSNTDANILFYWG 
LRNTLTRIRKWIEHSKLREINFKIVEFNPMGLKGKIRPDSSRPE 
LLQ PLNFVRFYLPLL IHQHE KV I YLDDDVI VQGDIQE LYDTTLA 
LGHAAAFSDDCDLPSAQDINRLVGLQNTYMGYLDYRKKAI KDLG 
ISPSTCSFNPGVIVANMTEWKHQRITKQLEKWMQKNVEENIiYSS 
SIiGGGVATS pmli vfhgkystin PLWHI RHLGWN pdarys ehfl 

QEAKLLHWNGRHKPWDFPSVHNDLWESWFVPDPAGIFKLNHHS 


6556 


241 


1449 


ASLCKGCFFVTHVLViiLPSLQ^PP^TFGFLLDIDGVLVRGHRVI 
PAALKAFRRLVNSG^^IJiVPVVFVTNAGNILQHSKAQELSALLG 
CEVDADQVILSHSPMKLFSEYHEKRMLVSGQGPVMENAQGIiGFR 
NVVTVDELRMAFPLLDMVDLERRLKTTPIjPRNDFPRIEGVLLLG 
EPVRWETSUJLIMDVI^NGSPGAGLATPPYPHLPVIiASNMDLIj 
WMAEAKMPRFGHGTFLLCLETIYQKVTGKELRYEGLMGKPS ilt 
YQYAE DL IRRQAERRGWAAP I R KL YAVGDNPMS DVYGANL FKQY 
LQKATHDGAPELGAGGTRQQQPSASQSCISILVCTGVYNPRNPQ 
STE P VLGGGEPP FHGHRDLCFS PGLMEASHWNDVNEAVQL VFR 
KEGWALE 


6557


2598 


1534 


RMCGRTSCHIjPRDVLTRACAYQDRRGQQRIjPEWRDPDKYCPSYN
KSPQ3NSPVLLSRLHFEKDADSSERI IAPMRWGLVPSWFKESDP 
SKLQFNTTNCRSDTVT^EKRSFK^1^KGRRCVVIAIX3FYEWQRC 
QGTNQRQPYFIYFPQIKTEKSGSIGAADSPENWEKVWDNWRLLT 
MAGIFDCWEPPEGGDVLYSYTI ITVDSCKGLSDIHHRMPAILDG 
E EA VS KWLD FGE VS TQEAL KL I H PTEN I TFHAVS S WNNS RNNT 
*iik~Li/U J VUJjVVKKEljRASGSSQRML 

ESDVPQWSSQFLQKSPLPTKRGTAGLLEQWLKREKEEEPVAKRP
YSQ


6558


21 


1138 


r nwruxjU9Vjrcra^c.lAii> UiaiSfciVjlt &AAEE EGEPKVKKKRIjLCVEFAS 
VAS CDAAVAQC FLAE NDWEME RALNS Y FE PPVEESALERRPETI 
S EPKTYVDLTNEETTDSTTS K I S PS E DTQQENGS MFS L I TWNI D 
GLDI^LSERARGVCSYIALYSPDVIFLQEVIPPYYSYLKKR5S 
NYEIITGHEEGYFTAIMLKKSRVKLKSQBIIPFPSTKMMRNIiLC 
VHVNVSGNELC^TSHLESTRGHAAERMNQLKMVLKKMQEAPES 
ATVIFAGDTNLRDREVTRCX3GLPNNIVDVWEFLGKPKHCQYTWD 
TQMNSNLG I TAACKLR FDRI FFRAAAE BGH I I PRSLDLLGLEKL 
DCGRFPSDHWGLLCNLDI IL 


6559 


3 


364 


GPELSGLPTRPKKLkANQ^PiAMDCCASRSCSVPTGPATTICSS 
DKSCROGVCLPSTCPHTVWLLEPTCCDNCPPPCHIPQPt^PTCF 
LLNSCQPTPGLETLNLTTFTQPCCEPCXPRGC 


6560 


3 


1435 


TATSGGIWLRJUCWRC^WPRPLPQSCVGTEGGLQVRDTSSRIAKG 
G VDHTKMS LHGASGGHERSRDRRRS SDRSRDSSHERTE S QLTPC 
IRNVTSPTRQHHVEREKDHSSSRPSSPRPQKASPNGSISSAGNS 
SRNSSQSSSDGSCKTAGEMVFVYENAKBGARNIRTSBRVTLIVD 
NTRFWDPS I FTAQ PNTMLGRMFG S GREHNFTRPNE KG E YEVAE 
GIGSTVFRAILDYYKTGIIRCPDGISIPELREACDYLCISFEYS 
TI KCRDLSALMHELSNDGARRQFEFYtiEEMILPLMVASAQSGER 
ECHIWLTDDDWDP7DEEYPPQMGEEYSQ1IYSTKLYRFFKYIE
NRD VAKS VLKERGL KK I RLG t EG YPTYKEKVKKR PGGRPB V I YN 
YVQRPFIRMSWE KEEGKSRHVDFQCVKS KS ITNLAAAAADI PQD 
QLVVMHPTPQVDELDILPIHPPSGNSDLDPDAQNPML 


6561 


3 


1086 


PGRRFRRKESSSSRWFPADCLLGt»R(jpASSLi^PEPSPSWPSHS 
PCPMAALTDLSFM YRWFKNCNLVGNLSE KYVFITGCDSGPGNLL 
AKQLVDRGMQVLAACFTEEGSQKLQRDTSYRrjQTTLLDVTKSES 
I KAAAQV7VRDKVGEQGLWALVNNAGVGLPSGPNEWLTKDDFVKV 
INVNLVGLI2VTLHMLPMVKRARGRVVNMSSSGGRVAVIGGGYC 
VS KFG VEAFSDS I RREL YYFG VKVCI IE PGN YRTAILGKENLES 
RMRKLWERLPQETRDS YGEDYFRI YTDKLKNIMQVAR PRVRDVI 

NSMEHAIVSRSPRIRYNPGLDAKLLYIPLAKLPTPVTDFILSRY 
LPRPADSV 


6562 


1 


1562 


MSTLYDIRAHKAQLIJIFFA^SDSNKALEQRRTLHTPKLEHLDRV
LYEWFLGKRSEGVPVSGPMLIEKAKDFYEQMQLTEPCVFSGGWL 
WRFKARHG IKKLDASS EKQSADHQAAEQFCAFFRSIiAAEHGLSA 
EQVYNADETGLFWRCLPNPTPEGGAVPGPKQGKDRLTVLMCANA 
TGS HRL K PLAIG K CS G P RAFKG I QHIiPVAYKAQGNAW VDKB I FS 
DWFHHIFVPSVREHFRTIGLPEDSKAVLLLDSSRAHPQEAELVS 
SNVFTIFLPASVASLVQPMEQGIRRDFMRNFINPPVPLQGPHAR 
YNMNDAI FS VACAWNAVPSHVFRRAWRKLWPS VAFAEGSSSEB E 
LEAECFPVKPHNKSFAHILELVKEGSSCPGQLRQRQAASWGVAG 
REAEGGR PPAATS PAEWWSSEKTPKADQDGRGDPGEGEEVAWE 
QAAVAFDAVLRFAERQPCFSAQEVGQLRALRAVFRSQQQVRRRR 
GALGAWKVEAIiQEG PGGCGATAQS PLP CSSTAGCN 


6563 


1319 


2694


LARPAQPVLLRE PEGAGP P VPAGHLVHHLQGGHLRERAHP DtJSA 
HEHPLPCDQM FWRQNGGHliRMVE ANSRGWWG IGYDHTAWVYTG 
G YGGG CFQGLAS STSNI YTQSDVKCVHI YBNQRWNP VTGYTSRG 
LPTDRYMWS DASGLQECTKAGT KP PS LQWAWVSDWFVDFS VPGG 
TDQEGWQ YASDFPASYHGS KTMKD FVRRRCWARKC3CLVTSGPWL 
BVPPIALRDVS 1 1 PESPGABGSGHS IALWAVSDKGDVLCRLGVS 
ELN PAGSSWLHVGTDQP FAS I S I GACYQVWAVARDGSAF YRGSV 
YPSQPAGDCWYHIPSPPRQRLKQVSAGQTSVYALDENGNLWYRQ 
GITPSYPQGSSWEHVSNNVCRVSVGPLDQVWVIANKVQGSHSLS 
RGTVCHRTGVQPHEPKGHGWDYGIGGGP?DH ISVRANATRAPRSS 
SQEQEPSAPPEAHGPVCC 


6564


1 


9^ 


APGSCALWSYCGRGWSRAMRGCQLLGLRSSWPGDLLSARLLSQE 
KRAAETHFGFETVS EEEKGGKVYQ VFES VAKKYDVMNDMMSLG I 
HRVWKDLIXWKMHPLPGTQLLDVAGGTGDIAFRFUTYVQSQHQR 
KQKRQLRAQQNLS WEE I AKE YQNE EDSLGGS RWVCDINKEMIiK 
VGKQ KALAQG YRAGLAW VJOGDAEELP FDDDKFD IYTIAPGI RNV 
TH I DQALQEAHRVLKPGGRFLCLE FSQVNNPL IS RLYDLYS FQ V 
I PVLGE VI AGDWKS YQ YLVES I RRF P SQEEFKDM I EDAGFHKVT 
YES LTSG IVAIHS GFKL 


6565 


1464 


999 


RSAVANGLTKRRMGLKLNGRYISLIIiAVQIAYLVOJWRAAGKCD
AVFKGFSDCLLKLGDSMANYPQGLDDKTNIKTVCTYWEDFHSCT 
VTALTDCQEGAKDMWDKLRKESKNLNIQGSLFELCX5SGNGAAGS 
LLPAFP VLLVS LSAALATWLS F 


6566 


3 


1385 


KYESAQPGGTQPEPGLGARMAIHKALVMCLGLPLFLFPGAWAQG 

HVP PGCSQG LNP L YYNLCDRSGAWG I VLEAVAGAG I VTTFVLT I 

ILVASLPFVQDTKKRSLLGTQVFFLIiGTIiGLFCLVFACVEKPDF 

STCASRRFLFGVLFAICFS CLAAHVFALNFLARKNHGPRGWVT F 

TVALLLTL VEVI INTEWL 1 1 TLVRGS GEGGPQGNS S AGWAVAS P 

CMANMDFVMALIYVMIiIiLLGAFLG^PALOT 

LLTTATS VAI WWW I VM YTYGNKQHNS PTWDD PTLAI ALAANAW 

AFVLFYVIPEVSQVTKSSPBQSYQGDMYPTRGVGYETILKEQKG 

QSMFVENKAFSMDEPVAAKRPVSPYSGYNGQLLTSVYQPTEMAL
~MHKV?SEGAYD I ILPRATANSQVMG3 ANSTLRAEDM YS AQSHQA 
ATPPKDGKNSQVFRNPYVWD 


6567 


125 


863 


TKR5NLKAYACS IHHIRTMSYVFVNDSSQTNVPLIiQACIDGDFM 
YS KRLLESGFDPNIRDS RGRTGLHLAAARGNVD I OQLLHKFGAD 
LLATD YQGNTALHLCGH VDT1 QFLVSNGLKI D I GNHQGATPLVL 
AKRRG VNKDVIRLLESLEEQEVKGFNRGTHS KLETMQTAESESA 
MB SHS LLNPNLQQGEG VLSS FRTTWQE FVEDLGFWRVLLI* I PVI 
ALLSLGIAYYVSGVLPFVENQPELVH 


6568 


3 


1183 


HASDRLLVLPDNYSHFSQASANLQGPSRTTELFHPtLASISSPM 
LEGAELYFNVDHGYTjEGLVRGCKASLLTQQDYINLVQCETLEDL 
KIHLQTTDYGNFIANHTNPLTVSKIDTEMRKRLOGEFEYFRNHS 
LE PLS TFLTYMTCS YMI DNVI LLMNGALQKKS VKEILGKCHPLG 
RFTEMEAVNIAETPSDLFNAILIETPLAPFFQDCMSEKALDELN 
IELLRNKLYKS YIiEAFYKFCKNHGDVTAEVMCP ILEFEADRRAF 
1 1 TLNS FGTELS KEDRETL Y PTFGKL YPEGLRLLAQAEDFDQMK 
NVADH YG VYKPLFEAVGGS GGKTLE DVFYERE VQMUVIiAFNRQF 
H YGVF YAYVKLKEQEI RN I VWIAECI S QRHRT KIN3YIPIL 


6569 


205 


1532 


RRRGPQRLGHGRPTPLLCRWRTA^PSHWfiKQARAFQGLRPVDPR 
RMS WL FPLTKSAS S S AAGS PGGI/TS LQQQKQRLI ESLRNSHSS X 
AEIQKDVEYRLPFTINNLTININILLPPQFPQEKPVISVYPPIR 
HHLMDKQGVYVTS PLVNNFTMHSDLGKI IQSLLDEFWKNPP VTiA 
PTSTAFPYLYSNPSGMSPYASQGFPFLPPYPPQEANRSITSLSV 
ADTVSSSTTSHTTAKPAAPSFGV^SNLPLPIPTVnASIPTSQNG 
FGYKMPDVPDAFPEI*SELSVSOrjTDMMROPP , VT.T.PnTrT. , T ? T DfiT V 

QI ITDKDDliVKS IEELARKNLLLEPSLKAKRQTVLDKYELLTQM 
KSTFEKKMQRQHELS ESCSASALQARLKVAAHEAEEESDNIAED 
FLEGKMEIDDFLSSFMEKRTICHCRRAKEEKLQQAIAMHSQFHA 
PL 


6570


330 


1304 


ARLPRLTFI.REGFLYVLL9HWVFVGAPRPPASDSWKKGLVPSAP 
PASRKMGS KALPAPI PLHPSLQLTNYS FLQAVNTFPATVDHLQG 
LYGLS AVQTMHMNHWTLGY PNVHEITRSTITEMAAAQGI.VDAR? 
P FPALPFTTHI1FHPKQGAIAHVI1PALH KDRPRFDFANLAVAATQ 
EDPP KMGDLSKLS PGLGS PI SGLS KLTPDRKPSRGRLPSKTKKE 
F I CKFOGRHFTKS YWLLIHERTHTDBRPYTCD I CHKAFRRQDHL 
RDHRYIHS KEKPFKCQECGKGFCQSRTLAVHKTLHKQTSSPTAA 
SS AAKCSGETVI CGGT 


6571


169 


656 


APDMNRKKLQKLTDTIjTKKCKHLFRGFDKDNDGCVNVLBWJHG£~ 
SI*FLRGSLEEKMKYCFEVFDIjNGDGFISKEEMFHMLKNSLLKQP 
SEEDPDEGIKDLVEITLKKMDHDHDGKLSFADYEIAVREETLLL 
EAFGPCLPDPKSQMEFEAQVFKDPNE FNDM 


6572 


49 


1646 


TP ERAQ PGALLGAAG CC VCGGRWW PRSHERG YFSSAKMGSKRRN 
LSCSERHQ KLVDBNYCKKLHVQALKNVNSQIRNQMVQNSNDNRV 
QRKQFLRLLQNEQFEIjDMEEAIQKAEENKRIjKELQLKQEEKLAM 
ELAKLKHESLKDEKMRQQVRENSIELRELEKKLKAAYMNKERAA 
Q IAEKDAI KYEQMKRDAEIAKTMME EHKRI I KEENAAEDKRNKA 
KAQYYLDLEKQLBEQEKKKQEAYEQLLKEKLMIDEIVRKIYEED 
QLEKQQKLEKMNAMRRYIEEFQKEQALWRKKKREEMEEENRKI'I 
EFANMQQOREEDRMAKVQENBEKRLQLQNALTQKLEEMLRQRED 
LEQVRQEL YQEEQAE I YXS KLKEEAEKKLRKQKEMKQDFEEQMA 
L KEL VLQAAKEBEENFR KTMLAK FAEDDR I E LMNAQ KQRMKQLE 
HRRAVEKLIEERRQQFLADKQRELBEWQLQQRRQGPINAIIEBB 
RLKLLKBHATNLLGYLPKG7FKKEDDIDLLGEEFRKVYQQRSEI 
CEEK 


6573 


767 


275 


GGGC^ESOSPfeAQDGTRTPATfidLMYLQGPRmn^YDMVQK 
LFLDFFRRRLSQRPTAEELEQRNI LKPRNEQEEQEEKRE I KRRL 
TRKLSQRPTVEELRERKrLTRFSDYVEVADAQDYDRRADKPWTR
LTAADKVSRGECWRVGGRTVCWVSLGSPLGSV 


6574 


204 


1159 


LESSVPVSVGVFWACGVSWTGAAGLQDGALSDTMARNAEKAMTA
LARFRQAQLEEGKVKERRPPLASECTBLPKAEKWRRQIIOBISK 
KVAQIQNAGLGE PR I RDLNDE INKLLR E KGHWEVRI KELGGPDY 
GKVGPKMLDHEGKEVPGNRGYKYFGAAKDLPGVRELFEKEPLPP 
PRKTRAELMKAIDFEYYGYLDEDDGVIVPLEQEYEKKLRAELVE 
KWKAEREAR LARGE KEEEEEEEEEIN I YAVTEEESDEEGS QEKG 

GDDSQQKFIAHVPVPSQQEIBEALVRRKKMELLQKYASETLQAQ 
SEEARRLLGY 


6575 


117 


820 


SPAIASQSGGITEBKMLEPQENGVIDLPDYEHVEDETFPPFPPP 
AS PERQDGEGTEPDEESGNGAP VPVPPKRTVKRN IPKLDAQRL I 
SERGLPALRHVFDKAKFKGKGHEAEDLKMLIRHMEHWAHRLFPK 
LQFEDFIDRVEYLGSKKEVQTCLKRIRLDLPILHEDFVSNNDEV 
AENNEHDVTSTELDP FLTNLS ES EM FASELS I S LTEEQQQRIER 
NKQLALERRQAKLP 


6576 


1 


1060 


PEPQALVGQKRGALRLLVARLVLTVSAPAEVRRRVLRPVLSWRD
RETRALADSHFRGLGVDVPGVGQAPGRVAFVSEPGAFSYADFVR
GFLLPNLPCVFSSAFTQGWGSRRRWVTPAGRPDFDHLLRTYGDV 
WP VANOGVQEYNSNPKEHMTLRD YI TY WKE YIQAG YSS PRGCL 
YLKDWHLCRDFPVEDVFTLPVYFSSDWLNEFWDALDVDDYRFVY 
AGPAGSWSPFHADIFRSFSWSVNVCGRKKWIiLFPPGQEEALRDR 
HGNLPYDVTS PALCDTHLHPRNQLAGPPLE ITQEAGEM VFVPSG 
WHHQVHNLVMCCFSCPLSGAFLQEDGSTTS PLSQ PELGWNGVAH 
G 


6577 


2271 


387 


SDRMASDDFDIVTEAMLEAPYKKEEDEQQRKEVKKDYPSNXTSS
TSNSGNETSGSSTIGBTSNRSRDRDRYRRRNSRSRSPGRQCRHR 
SRSWDRRHGSESRSRDHRREDRVHYRSPPLATGYRYGHSKSPHF 
REKSPVREPVDNLSPEERDARTVFCMQLAARIRPRDLBDFFSAV 
GKVRDVRIISDRNSRRSKGIAYVEFCEIQSVPLAIGLTGQRLLG
VPIIVQASQAEKNRLAAMANNLQKGNGGPMRIiYVGSLHFNITED
MLRGIFEPFGKIDNIVLMKDSDTGRSKGYGFITFSDSECARRAL
EQLNGFELAGRPMRVGHVTERLDGGTDITFPDGDQELDLGSAGG 
RFQLMAKLAEGAG I QLPSTAAAAAAAAAAQAAALQLNGAVPLQA 
LNPAALTALSPALNLASQCLQLSSLFTPQTM 


6578 


371 


1489 


PSSSATMNRAPLKRATIIjHMALTGASDPSAEAEANGEKPFLLRA
LQIALWSLYWVTS ISMVFLNKYLLDS PSLRLDTP I FVTFYQCL 
VTTLLCKGLSALAACCPGAVDFPSLRLDLRVARSVLPLSWFIG 
MITFNNLCLKYVGVAFYNVGRSLTTVFNVLLS YLLLKQTTS FYA 
LLTCGI I IGGFWLGVDQEGAEGTLSWLGTVFGyLASLCVSLNAI 
YTTKVLPAVDGS I WRLTFTNNVNACILFLPLLLLLGELOALRDF 
AQLGSAHFWGMMTLGGLFGFAIGYVTGLQIKFTS PLTHNVSGTA 
KACAQTVLAVLYYEETKSFLWWTSNMMVLGGSSAYtWVRGWEMK 
KTPEEPSPKDSEKSAMGV 


6579 


2 


711 


RPPRVWYPELRELSAAAPRWSHRTAPGIMVFYFTSSSVNSSAYT 
IYMGKDKYENEDLI KHGWPEDI WFHVDKLSSAHVYLRLHKGENI 
EDI PKEVLMDCAHLVKANS IQGCKMNNVNWYTPWSNLKKTADM 
DVGQIGFHRQKDVKIVTVEKKVNEILNRLEKTKVERFPDIAAEK 
ECRDREERNEKKAQIQEMKKREKEEMKKKREP4DELRSYSSLMK7 
ENMSSNQDGNDSDEFM 


6580 


62 


1571 




LVALKNWKPKGTNIPAPQSPVFGEAVSGVYMMTKVLGMAPVLGP

rppqeqvgplmvkveekeekgkylpslemfrqrfrqfgyhdtpg 
prealsqlrvlccewlrpeihtkeqilellvleqfltilpqelq 
awvqehcpesaeeavtlledlereldepghqvstppneqkpvwe 
kisssgtakespssmqpqpletshkyeswgplyiqesgeeqefa 
qdprkvrdcrlstqheesadeqkgseaeglkgdiisvi iankpe 
aslerqcvnlenekgtkpplqeagskkgresvptkptpgerryi
CAECGKAFSNSSNLTKHRRTHTGEKPYVCTKCGKAFSHSSNLTt 
HYRTHLVDRPYDCKCGKAFGQSSDLLKHQRMHTEEAPYQCKDCG 
KAFS GKGS L IRH YR I HTGEKP YQ CNE COKS FS QHAGLS SHQRLH 
TGEKPYKCKECGKAPNHSSNFNKHHRIHTGEKPYWCHHCGKTPC 
S KSNLSKHQRVHTGEGEAP 


6581 


228 


476 


RVFLKDLSSTPMASNNTASIAQARKLVEQLKMEANIDRIKVSKA 
AADLMAYCEAHAKEDPLLTPVPASENPFREKKFFCAIL 


6582


1428 


718 


CFTTKTHCSPVSVPYLSPLVLRKELESIiLENEGbOVIHTSSFlN 
QHPI I FWTLVWYFRRLDLPSNLPGLILTSEHCNEGVQLPLSSLS 
QDSKLVYIQLLWDNINLHQEPREPLYVSWRNFNSEKKSSLLSEE 
QQETSTLVETI RQS IQHNNVLKPINLLSQQMKPGMKRQRSLYRE 
ILPLSLVSLGRENIDIEAFDNEYGIAYNSIiSSEILERLQKIDAP 
PSASVEWCRKCFGAPLI 


6583 


487 


41 


RI PSMTSGRLRWRCTWRPATALWSASLRLGTSSMHPS PRS ISLP 
LSMMLSPLPSNTRGIiSPTALFRSPDSEHATSCPRLHLWRCRAPI* 
RSPSPIjGRLQVLPRSPLHVHTHNSGKEVLGLQVQRSRSGTGPAC


6584


189 


1750 


PLPMAALGPS S QNVTE Y WRVPKNTTKKYN I MAFNAADKVNFAT 
WNQARLERDLSNKK1YQEEEMPESGAGSBFNRKLREEARRKKYG 
IVLKEFRPEDQPWLLRVNGKSGRKFKGIKKGGVTEKTSYYIFTQ 
CPDGAFEAFPVHNWYNFTPLARHRTLTAEBAEEEWERRNKVLNH 
FS IMQQRRLKDQDQDBDEEEKE KRGRRKAS ELR IHDLEDDLEMS 
SDASDASGEEGGRVP KAKKKAPLAKGGRKKKKKKGSDDEAFEDS 
DDGDFEGQEVDYMSDGSSSSQEEPESKAKAPQQEEGPKGVDEQS 
DSSEESEEEKPPBEDKEEEEEKKAPTPQEKKRRKDSSEESDSSE 
ESDIDSEASSAFFMAKKKTPPKPJSRKPSGGSSRGNSRPGTPSAE 
GGSTSSTLRAAASKLEQGKRVSBMPAAKRLRLDTGPQSLSGKST 
PQPPSGECTTPNSGDVQVTEDAVRRYLTRKPMTTKDLLKKFQTKK 
TGLSSEC^VNVLAQILKRLNPERKMINDKKHFSLKE 


6585 


3 


1678 


KWKEQRAAQKADVI iTTGAGNP VGDKLNVITVGPRG PLLVQDWF 
TDEWAHFDRERIPERVVHAKGAGAFGYETEVTHDITKYSKAKVPE 
HIGKKTPI AVRFS WAGESGSADTVRDPRGFAVKFYTEDGNWDL 
VGNNTP I FFI RD P I LFP S PIHSQKRNPQTHLKDP DMVWDFWSLR 
P ESLHQVSFLF8DRGI PDGHRHMNGYG3HTFKLVNANGEAVYCK 
FHYKTDO^IKNLSypDAARLSQEDPDYGIRDLFNAIATGKYPSW 
TFY I Q VMTFNQ AE T FP FN PFDLiTKVW PHKDYP L I P VGKL VLNRN 
P VNYFAEVEQ IAFD PSNMP PG I KAS P DKMLQGRLFAYPDTHRHR 
LGPK YLH IPVNCP YRAR VANYQRDGPMCMQDNQGGAPNY YPNS F 
GAPEQQPSALBHS I Q Y SGE VR RFNTAND DNVTQ VRAFYVNVLNE 
EQRKRLCENI AGHLKDAQI F IQKKAVKNFTEVHPDYGSHIQAIiL 
DKYNAEKPKNAIHTFVQSGSHLAAREKANL 


6586 


32 


804 


PLPEQPA2STSTMPV5GTPAPNKKRKSSKLIMBLTGGGQESSGL 
NLGKKI SVPRDVMLEELSLLTNRGSKMFKLROMRVEKFI YENHP 
DVFSDS SMOHFQ KFLPTVGGQLGTAGQGFS YS KSNGRGGSQAGG 
SGSAGQYGSDQQHHLGSGSGAGGTGGPAGQAGRGGAAGTAGVGB 
TGSGDQAGGEGKH ITVFKTYI SPWERAMGVDPQQKMELG IDLLA 
YGAKAELP K YKS FNRTAMP YGG YEKASKRMTFQMPKV 


6587


75 


1117 


RRVPSLGKMPECWDGEHDI ETPYGI>I>HWIRGS PKGNRPAI LTY 
HDVGLNHKLCFNTFFNFEDMQE ITKHFWCHVDAPGQQVGASQF 
PQGYQFPSMEQLAAMLPSVVQHFGFKYVIGIGVGAGAYVLAKFA 
L I FPDL VEGLVL VN I DPNGKG W I DWAATKLSGLTS TL PDT VLS H 
LFSQBELVNNTELVQSYRQQIGNWNQANLQLFWNMYNSRRDLD 
INRPGTVPNAKTLRCPVMLVVGDNAPAEDGVVECNSKLDPTTTT 
FLKMADSGGLPQVTQPGKLTEAFKYFLQGMGYMPSASMTRLARS 
RTASLTSASSVDGSRPQACTHSESSEGIiGQVNHTMEVSC


6588 


137 


501 


LGLQAQLLEIiRTNNYQLSDELRKNGVELTSLRQKVAYLDKEFSK 
AQKALSKSKKAQEVEVLLSENEMLQAKLHSQEEDFRLQNSTLMA 
EFSKLCSQMBQLEQENQQLKEGAAGAGVAQAGP 1 


6589 


2


1405 


RPWGSAMATFSRQEFFO^LLQdtLtPTAQOGLDQIWLtLAtCIA 
CRLLWRLGLPSYLKHASTVAGGFFSLYHFFQLHMVWVVLLSLLC 
YLVL FLCRHS SHRG VFLS VTI L I YLLMGEMHMVDTVTlflHKMRGA 
QMI VAMKAVS LGFDLDRGEVGTVPSP V£ FMGYLYFVGTI VFGPW 
ISFHSYLQAVOX^PLSCRWLQKVARSriALALLCLVI^TCVGPyL 
FP YF I PLNGDRLLR1JKKRKARGTMVRWLRAYES AVS FHFSN YFV 
G FLS EATATLAGAGFTEEKDHLE WDLTVS KPLNVBI* PRSMVEVV 
TSWNLPMS YWLNNYVFKNALRLGTFSAVLVT YAASALLHGFS FH 
LAAVLLS LAF ITYVEHVLRKRLAR ILSACVLS KRCPPDCSHQHR 
LGLGVRALNLLFGALAIFHLAYLGSLFDYDVDDTTEEQGYGMAY 
TVHKWSELSWASHWVTFGCWIFYRLIG 


6590 


2177 


656 


VRAY3HVLS LLENVFTPMFCHRDE YFRQLLRGAES PTRNS KLNR 
GSLSLDDFRNTQKRGESFGISRIGSKIKGVFKSTTMEGAMLPNY 
GVAEGEDDFIEEGI WMEDDSPVEAVSTPNTPRNLAAMKI S IPY 
VDFFEDPSS ERJCEKKERI PVFCIDVERNDRRAVGHEPEHWSVYR 
RYLEFYVLBSKLTEFHGAFPDAQLPSKRI 1GPKNYEFLKS KREE 
FQEYLQKLLQHPELSNSQLLADFLS PNGGETQFLDK I LPD VNLG 
KIIKSVPGKLMKEKGQHLEPFIMNFINSCBSPKPKPSRPELTIL 
S PTSENNKKLFOT5LFKNNANRAENTERKQNQITYFMEVMTVEGVY 
DYLMWGRWFQVPDWUIHIiLMGTRILFKNTIiEMYTDYYLQCKL 
EQLFQEHRLVSLXTIiIjRDAIFCEKTEPRSLQDKQKGAKQTFEEM 
MNYIPDLLVKCIGEETKYESIRI/LFDGLQQPVLNKQLTYVLEJ3I 
VIQELFPELNKVQKEVTSVTSWM 


6591 


2177 




VRAYEHVLSiXENVFTPMFCHRDEY FRQIiIiRGAESPTRNSKLNR 
GSLSLDDFRNTQKRGBSFGISRIGSKIKGVFKSTTMEGAMLPNY 
GVAEGEDDFIEEGIWMEDDSPVEAVSTPNTPRNLAAWKISIPY 
VDFFBDPSSBRKEKKERIPVFCIDVERNDRRAVGHEPBHWSVYR 
RYLEFYVLESKLTEFHGAFPDAQIiPS KRI IGPKNYEFLKS KREE 
FQBYLQKLLQHPELSNSQIiLADFLSPNGGETQFLDKILPDVNtiG 
KIIKSVPGKLMKEKGQHLEPFIMNFINSCESPKPKPSRPEIiTIL 
SPTSENNKKLFNDLFKNNANRABNTEIUCQNQNYFMEVMTVEGVY 
DYLMYVGRWFQVPDWIJniLIJ^TRILFKNTLEMYTDYYLQCKL 
EQLFQEHRLVS LI TLLRDAI FCBNTEPRSLQDKQKGAKQTFEEM 
MNYI PDLLVKC IGE ETKYE S I RLL FDGLQQPVLNKQLT YVLLDI 
VIQELFPELNKVQKEVTSVTSWM 


6592 


3 


1861 


APEFLGSTISSGSMIDANLKLLQEAEQRLKAIVAEKFAIATKEG 
DLPQVERFFKlFPLLGLHBEGIiRKFSEYLCKQVASXAEBNLLMV 
LGTDM S DRRAAV I FADTLTLLFEG IARIVETHQP IVETYYGPGR 
LYTLIKYl^VECDRQVEKVVDKFIKQRDYHQQFRHVQNNLMRNS 
TTEKIEPREI^PILTEVTLMNARSELYLRFLKKRISSDFEVGDS 
MASEEVKQEHQKCLDKLLNNCLLSCTMQELIGLYVTMEEYFMRE 
TVN KAVALDTYE KG QLTS SM VDD VFY I VKKC I G RALS S S S IDCL 

CAM INLATTELBS DFTOVLCNKLRMGF PATTFQD IQRGVT5AVN 
IMHSSLQ^GKFDTKGlESTDEAXMSFLVTLNNVEVCSENISTLK 
KTLESDCTKLFSQGIGGEQAQAKFDSCLSDLAAVSNKFRDLLQE 
GLTELNS TAIKPQVQPW INS F FS VSHNI EEEE FNDY EAND P WVQ 
QF I LNLEQQMAE FKASLS PVI YDSLTGLMTSLVAVELE KWLKS 
TFNRLGGLQ FDKELRSL I AYLTTVTTWTI RDKFARLSQMATI LN 
LERVTEILDYMGPNSGPLTWRLTPAEVRQVLALRIDFRSEDIKR 
LRL 


6593 


3 


1837 


EAFSAGSRRRGLALQRGVLGGLGGYCPCCCRRRGRLLVLLLLVR
RGGEGGGGRGRGDKRRRRQARRQRRRPEPAEARGGKMADVLSVL
RQYNI QKKE I WKGDEVI FGE FS WP KNVKTNYVVWGTGKEGQPR
EYYTLDS I LFLLNNVHLSHP VYVRRAATENI PVVRRPDRKDLLG 

YLNGEASTSASIDRSAPLEIGLQRSTQVKRAADBVLAEAKKPRI 

EDEBCVRLDKERLAARLEGHKEGIVQTEQIRSLSEAMSVEKIAA 

I KAKIMAKKRST I KTDLDDDI TALKQRS FVDAEVDVTRDIVSRE 

RVWRTRTTI LQSTGKNFSKNI FAI LQSVKAREEGRAPEQRPAPN 

AAPVDPTLRTKQP I PAAYNRYDQERFKGKEETEGFKI DTMGTYH 

GMTLKS VTEGASARKTQTPAAQP VPR P VS QARP P PNQKKGSRTP 

Ui IPAATTSJjITMIjNAKDIiIjQDLKFVPSDEKK^ 

I QRRKDQMQ PGGTAI S VTVP YRWDQP LKLMPQD WDR WAVFVQ 

GPAWQFKGWPWLLPDGSPVDIFAKIKAFHLKYDEVRLDPNVQKW 

DVTVLEIiSYHKRHLDRPVFLRVWETLDRYMVKHKSHLRF 


6594 


1 


1096 


EFPGRRFRGSQASPXiCATC3GPALLRAPTRAAMTRSLFKGNFWSA 
D I L S TIG Y DN I IQHLNNGRKNCKE FED FLKERAA I E E R YGKDLL 
NbS RKK P CGQSEINTLKRALE VFKQQ VDNVAQCHIQLAQSLREE 
ARKMEEFREKQKLQRKKTEL IMDAIHKQKSLQFKKTMDAKKNYE 
QKCRDKDEAEQAVSRSANLVNPKQQEKLFVKLATSKTAVEDSDK 
AYMLHIGTLDKVREEWQSEHIKACEAFEAQECERINFFRNALWL 
H VNQL5QQCVTSDEMYEQVRKSLEMCS I QRD I E YFVNQRKTGQ I 
PPAP IMYENFYSSQKNAVPAGKATG PNLARRGPLPI PKS S PDDP 
NYSLVDD YS LLYQ 


6595 


57 


781 


PLGTMSDSDLGEDEGLLSLAGKRKRRGNLPKESVKILRDWIjYLH 
RYNAYPSEQ EKLSLS GQTNLSVLQ I CNWFINARRRLLPDMLRKD 
GKDPNQFTISRRGGKASDVALPRGSSPSVLAVSVPAPTNVLSLS 
VCSMPLHSGQGEKPAAPFPRGELESPKPLVTPGSTLTLLTRAEA 
GS PTGGLFNTPPPTPPEQDKEDFS SFQLLVEVAIjQRAAEMELQK 
QQDPSLPLLHTPIPLVSENPQ 


6596 


2 


1026 


PRLPVRRYHGRRRLQGRSRGHMAEGDAGSDQRQNEEIEAMAAIY 
GEEWCVIDDCAKIFCIRISDDIDDPKWTLCLQVMLPNEYPGTAP 
P I YQLNAPWLKGQERADLSNSLEE I YIQNIGES ILYLWVEKTRD 
VLi I^AbUM 1 Js^GPDVKKKTEEED VECEDDL I LACQPESSVKALD 
FD 2 S ETRTEVE VEELPP IDHGI P I TDRRSTFQAHLAP WC P KQV 
KM VLS KLYENKKI ASATHN I YAYR I YCEDKQT FLQDCEDDGETA 
AGGRLLHLMEIIiNVKimmArSRVryGGItiLGPDRFKHINNC^N 
ILVEKNYTNSPEESSKALGKNKKVRKDKKRNEH 


6597 


2 


1026 


PRLPVRRYHGRRRL.QGRSRGHMAEGDAGSDQRQNEEIEAMAAIY 

(jc*il Vt 4- V X uU<~t\lS.L r L- J.K ±&UU JL UJJr'lS.W 1 JjtJjy VHLir'riii I t\j J. AP 

P X YQLNAPWLKGQ E RADLSNSLEE 1 Y I QNIGES I LYLWVEK I RD 
VL1QKSQMTEPGPDVKKKTEEEDVECEDDLIIACQPESSVKALD 
FDISETRTEVEVEELPPIDHGIPITDRRSTFQAHLAPWCPKQV 
KMVLSKLYBNKKIASATHNIYAYR1YCEDKQTFLQDCEDDGETA 
AGGRLLHIiMEILNVKNVMVVVSRVIYGGILLGPDRFXHINNCARN 
ILVEKNYTNSPEESSKALQKNKKVRKDKKRNEH 


6598


1099 


419 


PRVRWATTMAMSFEWPWQYRFPPFFTLQPNVDTRQKQLAAWCSL 
VLS FCRLHKQS S MT VMEAQE S PLFNNVKLQRKLPVES IQI VLEE 
LR KKGNIi&WLDKS KSS FL2 MWRRPEEWGKL I YQWVS RSGQNNS V 
FTLYELTNGEDTEDEE FHGLDEATLLRALQALQQEHKAE I IT VS 
DGPRRQVLLAGTCLPLLLTSHLSRAFKRRQTQCPPKTGSVTPPD 
SKGLQS 


6599 


164 


1593 


KMAALTTLF KY I DENQDRYI KKLAKWVAI QS VSAWPE KRGE I RR 
MME VAAAD VKQLGGS VELVD I GKQKL PDGSSI PLP P I LLGRLGS 
DP QKXTVCI YGHLDVQP AALEDGNDS EP FTLVERDG KLHGRGS? 
DDKGPVAGWINALEAYQKTGQEIPVNVRFCLEGMEESGSEGLDE 
LIFARKDTFFKDVDYVCISDNYWLGKKKPCITYGLRGICYFFIE 
VE CSNKDLHSG VYGGS VHEAMTDL I LLMGS LVDKRGNILI PG2N 
E AVAAVTREEHKIiYOD IDFDIEEFAKDVGAQILIiHS HKKD ILMH 
RWRYPSLSLHGIEGAFSGSGAKTVIPRKWGKFSIRLVPNMTPE
VVGEQVTS YLTKKFAELRS PNE FKVYMGHGGKPW VSD FS HPH YL 
AGRRAMKT VFG VEPDLTREGGS I P VTLTFQEATGKNVMLL P VGS 
ADDGAHSQNEKLNRYNYIEGTKMLAAYLYBVSQLKD 


6600 


2 


934 


PGRLFRVAAME S AGLEqLLRELlLPDTER I RRATEQLQ I VIiRAP 
AALSALCDLLAS AADPQ I RQFAAVLTRJRRLNTRWRR ioAAEQ RE S 

QHSTHSPHSPEREMGLLLLSVWTSRPEAFQPHHRELLRLLNBT 
LGEVGSPGLLFYSLRTLTTMAPYLSTEDVPLARMIiVPICLXMAMQ 
TLIPIDEAKACEALEALDEIjLESEVPVITPYLSEVLTFCLEVAR 

NVALiGNAT ft TP T V.CC1 .T FT ,VXVK <5 If AT ,T .KTTR T .7 J1TT .A DHC 

GC 


6601 


529 


1420 


PRAAARAPP PAVLRRDRRAATAPGAGEMTJjHGPIjAQRYFLNHIE 
KI TTWQD P RKAMNQ PLNH MNLH P AVSS TPV PQRS MAVS Q P NL VM 
NHQHQQCMAPS TLSQQNHPTQNP P AGLMSK PNALTTQQQQQQKL 
RLORIOMERERIRMROEEI»MRORAAT.r^nT.PMRAFTT.APvnAAir 

npptmtpdmrsitnnssdpflnggpyhsreqstdsglglgcysv 
pttpedflsnvdemdtgenagqtpmninpqqtrfpdfldclpgt 
nvdlgtles edl i plfndvesalnks3 p fltwi* 


6602 


127 


617 


LIJ)FPALPKFVI^QSPKAGKPSTMTSMTQSI^EVIKAKTKARNF 
ERVJjGKITLVSAAPGKVICEMKVEBEHTNAIGTLHGGLTATLVD 

NT STMATjTiPTRPnADnvQVnMMT'PVMQ DAIfT.f2PnTirr»TiVTnrr VTi 

GKTLAFTSVDIiTNKATGKLIAQGRHTKHJjGN 


6603 


79 


660 


PVGPSSLAARTGLGHLPFLHRJjASSRGLDMDT^ 
SGMGATGTLRTSLDPSLE I YKKMFBVKRRE0LLAL}Q7LAQLiNDX 
HQQYKILDVMLKGLFKVLEDSRTVLTAADVLPDGPFPQDEKLKD 
AFSHWENTAFFGDWLRFPRIVHYYFDHNSNWNLIiIRWGISFC 

nqtgvfnqgphspilslm 




3 


688 


ggggggnfrgggrggfgrgggrggfnkgqdqgpperwllgefl 

HPCEDDIVCKCTTDENKVPYFNAPVYLENKEQIGKVDEI FGQLR 

dfyfsvklsenmkassfkklqkfyidpykllplqrflprppgek 
gpprgggrggrgggrggggrgggrgggfrggrggggggfrggrg 
ggfrgrgh 


6605 


7 


948 


sgs rrgamraag vglvdchchlsapdfdrdldd vlekakkanw 
alvavaehsgefekimqlseryngfvlpclgvhpvqglppedqr 
svtlkdldvalpi ienykdrjjlaigevgldfsprfagtgeqkee 
qrqvlirqiqlakrlntjpvnto 

hafdgrpsvambgvragyffs ippsi irsgcxjklvkqiipltsic 
letds palgpekqvrnepwni s i s ae yiaqvkgi s vee vie vtt 
qnalkl f p klrhllq k 


6606 


2 


1682 


FVE I R P RAE VANLS AHS AS P I QDAVLKRLSLLED I V YRQLNGLS 
KSLGLIEGYGGRGKGGLPATLSPAEBE KAKGPHBKYGYNS YLSE 
KI SXiDR S I PDYRPTKCKELKYSXDLPQ IS 1 1 FI FVNEALSVI IiR 
SVHSAVNHTPTHLLKEI ILVDDNSDEEELKVPLEEYVHKRYPGL 
VKWRNQKREGL I RAR I EGWKVATGQVTGFFDAHVEFTAG WAE P 
VLSRIQENRKRVILPSIDNIKQDNFEVQRYENSAHGYSWELWCM 
YI SPP KDWWDAGDPSLPIRTPAM IGCS FWNRKFFGE IGLLDPG 
MDVYGGENIEXGIKVWLCGGSMEVLPCSRVAHIERKKKPYNSNI 
GFYTKPJNALRVAEVWMDDYKSHVYIAWKLPLENPGIDIGDVSER 
RALRKSLKCKNFQ W YLDHVYPEMRR YNNT VAYGE LRNNKAKDVC 
LDQG PLENHTAI Ii Y PCHGWG PQLAR YTKEGFLHIiGALGTTTIiLP 
DTRCIiVDNSKSRLPQLLOCDKVKSSLYKRWNFIQMGAIMNKGTG 
RCLEVENRGLAGIDLILRSCTGQRWTIKNSIK 


6607 


137 


984 


VPAf^I^KFJ^SrjTJ^PPRl^TiaiQASCPJUiFSPPIQSRQTT 
GISFGGRGGAGPGVPTRTQVFAAMGAVMGTFSSLQTKQRRPSKD 
KIEDELEMTMVCHRPEGLEQLEAQTNFTKRELQVLYRGFKNECP 








SGVVNEDTFKQIYAQFFPHGDASTYAHYLFNAFDTTQTGSVKFW 
D FVTALS I LLRGTVHE KLRWTFNLYD INKDG YINQEEMMD IVKA 
I YDMMGKYTYPVLKEDTPRQHVDVFFQ KMDKNKDGI VTLDE FLE 
SCQEDDNIMRSliQLFQNVM 


6608 


224 


1140 


RPCFSSPTGLCPRIjSYPMIIiLQHAVliFPPKQPSPSPPMSVATRS 
TGTLQLPPQKPFGQEASLPLAGBEELSKGGEQDCAIiEELCKPLY 
CKLCNVTLNSAQQAQAHYQGKNKGKKLRNYYAANSCPPPARMSN 
WBPAATP WPVPPQMGS FKPGGRVI LATENDYCKLCDASFS SP 
AVAQAHYQGKNHAKRLRLAEAQSNS FSESSELGQRRARKEGNEF 
KMMPNRRNM YTVQNNSGP YFNPRSRQRI PRDLAMCVTPSGQFYC 
SMCNVGAGEEMEFRQHLESKQHKSKVSEQRYRNEMENLGYV 


6609 


1 


443 


FRLRCRRFRVAGGRLAGAGLRESRVPAPEQRLSALTLLSW5AVT 
PAAE PGNFQLS PAEPRGPLASPVRAAPRAPCPAAEMSELNTKTS 
PATNO AAGOEE KflKAGNVlGCAFF RE F T n T TTT.T A P F TT7 \C a a T JX T n 
GKFRRFQKKKKDPSS 


6610 


319 


881 


GRKSLCNLH1 FIRFPLTYPDMYMGMMCTAKKCGIRFQPPAI ILI 
YESE I KGKIRQR IMPVRNFSKFSDCTRAAEQLKNNPRHKS YLEQ 
VSLRQLEKLFSFUIGYLSGQ8LAETMEQIQRETTIDPEEDLNKL 
DDKELAKRKS IMDBLFEKNQKKKDDPNFVYDIEVEFPQDDQLQS 
OGWDTESADBF 


6611 


978 


212 


PGCSGAGSRVWWIiPALRHLAMGSTBSSEGRRVS FGVDEBERVRV 
W^VRLSENVVNRMKEPSSPPPAPTSSTFGLQDGNLRAPHKEST 
LPRS GS SGGQQPS GMKEGVKRYEQEHAAIQDKLFQVAKREREAA 
TKHS KASLPTGEGS I SHEEQKS VRLARELESREAELRRRDTFYK 
EQLERIERKNAEMYKLSSEQFHEAASKMESTIKPRRVEPVCSGL 
QAQ ILHCYRDRPHEVLLCSDLVKAYQRCVSAAHKa 


6612 


1724 


992 


VSTHASALSRTQG QPQRQPRAAASGAGAGTAGGGGSGGAEGS KM 
STEAQRVDDSPSTSG3SSDGDQRESVQQEPEREQVQPKKKEGKI 
SSKTAAKLSTSAKR I QKELAE I TLD P P PNCSAG P KGDNI YE WRS 
TILGPPGSVYEGGVFFLDITFSPDYPFKPPKVTFRTRIYHCNIN 
SQGV I CLDILKDNWS PALTI SKVLLS I CS LLTDCNPADPLVGS I 
ATQYMTNRAEHDRMARQWTKRYAT 


6613 


130 


748 


ELELSSNMPEQSNDYRVAVFGAGGVGKSSLVLRFVKGTFRESYI 
PTVEDTYRQV1SCDKSICTLQITDTTGSHQFPAMQRLSISKGHA 
FI LVYS ITSRQS LEELKP I YEQICE I KGDV3S IP IMLVGNKCDE 
S PSREVQSSEAEALAR'fWKCAFMETSAKLNHNVKELFQELLNLE 
KRRTVSLQI DGKKSKQQKRKEKLKG KC VI M 


6614 


3 


1191 


SSAAEA^VLVRRCWGPPIJU^GARRGRPSPOWRALARLGWEDCR 
DSRVREKPPWRVLFFGTDOFAREALRALHAARENKEEELIDKLE 
WTMPS PS PKGLPVKQYAVQS QLPVYE WPDVGSGEYDVGWAS F 
GRI>LNEALILKFPYGIIjNVHPSC^PRWRGPAPVIHTVLHGDTVT 
GVTIMQIRPKRFDVGPILKQETVPVPPKSTAKELBAVLSRLGAN 
MLISVLKNLPESLSNGRQQPMEGATYAPKISAGTSCIKWEEQTS 
EQ I FRLYRAIGN I I PLQTLWMANTIKLLDLVEVNSSVLADPKLT 
GQALI PGSVI YHKQSQILLVYCKDGMIGVRS VMLKKSLTATDFY 
NGYLHPWYQKNSQAQPSQCRFQTLRLPTKKKQKKTVAMQQCIE 


6615 


832 


35 


GRVGAGASAMSELPGDVRAFLREHPSLRLQTDARKVRCILTGHE 
LPCRLPELQVYTRGKKYQRLVRAS PAFD YAE FEPHI VPSTKNPH 
QLFCKLTLRHINKCPEHVLRHTQGRRYQRALCKYEECQKQGVEY 
VPACLVHRRRRRBDQMDGIXSPRPREAFrairrSSDEGGAASDDSM 
TDLYPPELFTRKDLGSTEDGDGTDDFLTDKEDEKAKPPREKATD 
EGRRE TTVYRGL VQ KRG KKQLG SL KKK F KSHHRKPKS FS SCKQS 
G 


6616 


347 


1886 


LLPPCQGARPLSSPPHASEDNLFLFWNCILCAFPHPSPQPLQYP 
VWPLLLVITQIPAPRHLRNRPFSFSRGGLDSFSGSLSTPSICRS 








PAWVKMAPWPPKGLVPAVLWGLSLFLNLPGPtWLQPSPPPQSSP 
PPQPHPCHTCRGLVDS FNKGLERTIRDNFGGGNTAWEEENLS KY 
KDSETRLVEVLEGVCSKSDFECHRLLELSEELVESWWFHKQQEA 
PDLFQWLCSDSLKLCCPAGTFGPSCLPCPGGTERPCGGYGQCEG 
EGTRGGSGHCDCQAGYGGEACGQCGLGYFEAERNASHLVCSACF 
GPCARCSGPEESNCLQCKKGWALHHLKCVDIDECGTEGANCGAD 
QFCVNTEGS YECRJDCAKACLGCMGAGPGRCKKCS PG YQQVGSKC 
LDVDECETEVCPGENKQCENTEGGYRCICAEGYKQMEGICVKEQ 
I PESAGFFSEMTEDELWLQQMFFGII I CALATLAAKGDLVFTA 
I FIGAVAAMTGYWLS ERSDRVLEGFIKGR 


6617 


118 


673 


VWMAWQVSLLELEDRLQCP I CLEVFKESLMLQCGHS YCKGCLVS 
LSYHLDTKVRCPMCWQAVDGSSSLPNVSLAWVIEALRLPGDPEP 
KVCVHHRNPLSLFCEKDQELI CGLCGLLGSHQHHPVTP ISTVCS 
RMKEELAALFSELKOEQKKVDEIiIAKLVKNRTRlDGSAPSLCPC 
LGPATFTFL 


6618 


548 


136 


dgkvarrapnspafqndi^pLVsaprAttae^pwskvlqntqcJr 
nvpkmtsersripclsaaaaegtgkkqqegramatldrkvpspe 
aflgkpwsswidaaklhcsdnvdleeagkeggksrevmrlnkea 

WKYGT 


6619 


246 


842 


PAS S E VLTAAVMFLLIiNCI VAVSQNMG I GKNGDL PRP PLRNEFR 
YFQRMTTTSSVEGKQNLVIHGRKTWFS IPKKNRPLKDttl NLVLS 
RELKEPPQGAHFLARSLDDALKLTERPELANKVDMIWIVGGSSV 
YKEAMNHLGHLKLFVTRIMQDFESDTFFSEIDLEKYKLLPEYPG 
ILSDVQEGKHIKYKFEVCEKDD 


6620 


3 


1879 


NSRVDDFVARARMAAENEASQE3ALGAYSPVDYMSITSFPRLPE 
DE PAPAAP LRGRKDEDAFLGD PDTDPDS FLKSARLQRLPSSSS E 
MGSQDGSPLRETRKDPFSAAAAECSCRQDGLTVIVTACIjTFATG 
VTVALVMQIYFGDPQIFQQGAVVTDAARCTSLGI EVLSKQGSS V 
DAAVAAALCLGIVAPHSSGLGGGGVMLVHDIRRNESHLIDFRES 
APGALRRETLQRS WETKPGLLVGVPC3WKGLHEAHQLYGRLPWS 
QVLAFAAAVAQDGFNVTHDIJu^ALAEQLPPNMSERFRETFLPSG 
R P PL PGSLLHR PDLAEVL0VLGTSGPAAF YAGGNLTLEMVAEAQ 
HAGG VI TE E D F SNYSALVEKP VCG V YRGHLVLS PP P P HTGP AL I 
SALNI LEG FNLTSLVSREQALHWVAETLKI ALALASRLGDP VYI3 
STITESMDDMLSKVEAAYLRGHINDSQAAPAPIjIiPVYEIiDGAPT 
AAQVLIMGPDDFIVAMVSSLNQPFGSGLITPSGILLNSQMLDFS 
WPNRTANHSAPSLENSVQPGKRPLSFLLPTVVRPAEGLCGTYLA 
LGANGAARGLSGLTQVRFTP WLAFFSREPS CGLDCRCLS YLWLV 
SIPHAANMG 


6621 


1 


662 


VQG I TS YQQRLQALRKE KSRDAARS RRGKENFKF YELAKLLPL P 
AAI TSQLDKAS I IRLTISYLKMRDFANQGDPPWNLRMEGPPPNT 
SVKVIGAQRRRSPSALAI EVFEAHLGSHIIiQSIiDGYVFAJuNQEG 
KFLYISETVS IYLGLSQVELTGSSVFDYVHPGDHVEMAEQLGMK 
LPPGRGLLSQGTAEDGASSASSSSQSETPE PWCFPPASDQFLL 




2 


319 


GRASGAQEETEAGGPERARAMEANMPKRKEPGRSLRIKVISMGN 
AEVGKSCI I KRYCEKRFVSKYLATIGIDYGVTKVHVRDREIKVN 
IFDMAGHPFFYEVRKPF 


6623 


1886 


189 


KALFEKVKKFRLHVEEGDILYAMYVRQTVLKVIKFLI 1 1 AYNSA 
LVSKVQFTVDCNVDIQDMTGYKNFSCNHTMAHIiFSKLSFCYLCF 
V5IYGLTCLYTLYWLFYRSLREYSFEYVRQETGFDDIPDVKNDF 
AFMLHMIDQYDPLYS KRFAVFXSEVSENKLKQLNLNNE WTPDKL 
RQ KLQTNAKNRL ELP L I MLSGLPDTVPE ITELQSLKLE 1 1 KNVM 
I P AT I AQ LDNLQ ELS LHQCS VK I HS AAL S FLK ENLKVLS VKFDD 
MRELPPWMYGLRNLEELYLVGSLSHDISRNVTLESLRDLKSIiKI 
LS IKSNVSKIPQAVVDVSSHLQKMCIHNDGTKLVMLNNLKK^ 
LTELELVHCDLERIPHAVFSLLSLQELDLKENNLKSIEEIVSFQ 








HLRKLTVLKLWHNSITYIPEHIKKLTSLERLSFSHNKIEVLPSH 
LFLCNKIRYLDLS YND IRFI PPB IGVLQSLQYFS ITCNKVESLP 
DELYFCKKLKTLKIGKNSLS VLS PKIGNLLFLSYLDGKGNHFE I 
LPPELGD CRALKRAGLWEDALFETL PSDVREQMKTE 




218 


1786 


GSRRGGGSRIPAVSTHVAPGRSVIiRPFASGALRLRSLVKALGGC 
RGRPSGLAHLSQETSHWRAKRS GRACLGDFPGE ILRS FIMKCTA 
RBWLRVTTVLFMARAI PAMWPNATLLBKLIjEKYMDEDGEWWIA 
KQRG KRAI TDNDM Q S I LDLHNKLRSQVYPTASNME YMTWDVELE 
RSAESWAESCLWEHG PAS LLPS I GQNLGAHWGRYRPPTFHVQS W 
YDEVKDPSYPYEHECNPYCPFRCSGPVCTHYTQWWATSNRIGC 
AI NLCHNMN I WGQ I WP KAVYLVCNY S PKGNWWGHAPYKHGRP CS 
ACPPSFGGGCRENLCYKEGSDRYYPPREEETNEIERQQSQVHDT 
HVRTRSDDSSRNEVIS AQQMSQIVS CEVRLRDQCKGTTCNRYEC 
PAGCLDSKAKVIGS VHYEMQSS 1 CRAAIHYGI IDNDGGWVDITR 
QGRKHYF I KSNRNGI QTIGKYQSANSFTVS KVTVQAVTCETTVE 
QLCPFHKPASHCPRVYCPRKLYASKSTLCSCNWNSSLF 


6625 


1124 


543 


PGPRGGGGSLI^TKALdRSRGIiG^PGPSSGdflB^VPTALRPP 
GPLVPSTSDDNLLKNIELFDKLALRFHGRLLFLKDVLGDEICCW 
S FYGQGRKI AEVCCTS I VYATEKKQTKVBFPEARIFBETLNILI 
YE TPRGPDPALLEATGGAAGAGGAGRGEDE ENREHRVRR IHVRR 
HITHDERPHGQQIVFKD 


6626 


3 


1498 


SAVEFVYTDRFHLILG ISVEFLCSLRSDATMES ITACLHALQAL 
LDVPWPRSKIGSDQDSGIELLNVLHRVILTRESPSIQLASLEW 
RQI ICAAQEHVKEKRRSAEVDDGAAEKETLPEFGEGKDTGGLVP 
GKS L VFATLELCVC I LVRQLPELNP KLTGS PGVKATKPQI LLED 
GSRLVSAALVILSELPAVCSPEGS IS ILPTIL YLriGVLRETAV 
KLPGGQLSST VAAS LQ ALKG I LS S PMARAEKSRTAWTDLLRS AL 
TT I LDCWDP VD ETHQE LDE VS LLTAI TVF I LS TS PEVTT I P CLQ 
KRCIDKFKATLEIKDPWQIKTYQIiLHSIFQYPNPAVSYPYIYS 
LASCIMEKLQEIDKRKPENTAELEIFQEGIKVLETLVTVAEEHH 
RAQLVACLLP I LI S FLLDBNSLGS ATS IMRNLHDFALQNLMQIG 
PQYSSVFKSLVASSPALKARLEAAIKGNQBSVKVKIPTSKYTKS 
9GKNSSXQLJCTS FL 


6627 


1 


697 


G I PHLSSRDMTGTPGAVATRDGEAPERS P PCS PS YDLTGKVMLL 
GDTG VGKTCFL I Q FKDGAFLS GTF I ATVGIDFRNKWT VDGVRV 
KLQ I WDTAGQERFRSVTHAYYRDAQALLLLYDI TNKSS FDNIRA 
WLTE I HE YAQRDWIMLLGNKADMS S ERVIRSEDGETLARE YG V 
PFLETSAKTGMNVELAFLAIAKELKYRAGHQADEPSFQIRDYVE 
SQKKRSSCCSFM 


6628 


1 


1861 


QCAE FGGGSGGGGGSGGGGSGGGRGAGGEENKENERPSAGSKAN 
KEFGDS LS LBI LQ I IKESQQQHGLRHGDFQRYRG YCSRRQRRLR 
KTLNFKMGNRHKFTGKKVTBELLTDimYIJ^VIiMDAERAWSYAM 
QLKQEANTEPRKRFHLLSRLRKAVKHAEELERLCESNRVDAKTK 
LEAQAYTAYLSGMLRF EHQ E W KAA I EAFNKCKT I YE KLAS AFTE 
BtjH v it x « yit vise 1 a ±*N J. K i wixNJuDsJa AINE LMQMRLRSGGTE 
GLIiAEKLEALITQTRAKQAATMSEVEWRGRTVPVKIDKVRI FLL 
GLADNEAA1VQAESEETKERLFESMLSECRDAIQWREELKPDQ 
KQRD Y I LEGEPG KVSNLQ YLHS YLTYI KLSTAI KRNENMAKGLQ 
RALLQQQPEDDSKRSPRPQDLI RLYDI ILQNLVELLQLPGLEED 
KAFQKEIGLKTLVFKAYRC F FIAQS YVLVKKWSEALVL YDRVLK 
YANEVNSDAGAFKNSLKDLPDVQEL I TQVRSEKCS LQAAAI LDA 
N DAHQTETSSSQ VKDNKP LVERFETF CLDPS LVT KQ ANL VH F P P 
GFQ PI PCKPL FFDLAIiNHVAFPPLED KLEQKTKSGLTGY I KG IF 
GFRS 


6629 


4549 


GATPLGS VGGRTGKMDAATLT YDTLRFAEFEDFPETSE PVW I LG 
RKYSIFTEKDEILSDVASRLWFTYRKNFPAIGGTGPTSDTGWGC 








MLROGQMIFAQALVCRHLGRDWRWTQRKRQPDSYFSVLNAFIDR 
KDSYYSIHQIAQMGVGEGKSIGQWYGPNTVAQVLKKLAVFDTWS 
S IAVHI AMDWTWMEEI RRLCRTS VPCAGATA FPADSDRHCNGF 
PAGAE VTNRPS PWRPLVLLI PLRLGLTDINEAYVETLKHCFMMP 
QSLGVIGGKPNSAHYPIGYVGEBLIYLDPHTTQPAVEPTDGCFI 
PDESFHCQHPPC^IMSIAELDPSIAWRGGHLSTQAFGAECCLGM 
TRKTFGFLRPFPSMLG 


6630 


2 


423 


LVQCGGlRRRSAWGAMPGRHVSRVRAIiYKRVLQLHRVLPPDLKS 
LGDQYVKDE FRRHKTVGSDEAQRFLQKWEV YATALLQQANENRQ 
NSTGKACFGTFLPEEKLNDFRDEQIGQLQEIiMQEATKPNRQFSl 
SBSMKPKF 


6631 


2 


423 


LVOCGGIRRRSAWGAMPGRHVSRVRALYKRVLQLHRVLPPDLKS 
LGDQYVKDEFRRHKTVGSDEAQR PLQ E WEVYATAL LQQ ANENRQ 
N5TGKACPGTFLPEEKLNDFRDEQIGQLQELMQEATKPMRQFSI 
SESMKPKF 


6632 


1273 


588 


WN SRGRTQRGAA? LAPAAAMKAWQRVTRAS VTVGGEQI S A¥GR* 

GICVLLGISLEDTQKELEHMVRKILNLRVFEDESGKHWSKSVMD 

KQYE I LCVSQFTLQCVLKGNKPDFHLAMPTEQAEGPYNS FLEQL 

RKTYRPELIKDGKFGAYMQVHIQNDGPVTIELESPAPGTATSDP 

KQLSKLEKOX3QRKEICTRAKGPSESSKERNTPRKEDRSASSGAEG 

DVSSEREP 


6633 


1145 


617 


nxumibovr A utiui ^vU^'^'ulJ, X tfr\ 1 XlrOlAjJr WoVJUHoNirFJl/X 

AWGANGLDAIITQLLNQFENTGPPPADKEKIQALPTVPVTEEHV 
GS GLE C P VCKDDYALGERVRQLPCKHLFHDG C I VP WLEQHDSCP 
VCRKSLTGQNTATNPPGLTGVSFSSSSSSSSSSSPSNEN'ATSNS 


6634 


1 


1134 


CGGI PRKGSGPRRRLPMARLRDCLPRLMLTLRSLLFWSLVYCyC 
GLCAS Z HIiLKLLWS LGKG PAQTFRR PAREHPPACLSDPS LGTHC 
YVRI KDSGLRFHYVAAGERGKPLMLIjIjHGFPEFW YS WRYQLREF 
KS B YRWALDLRG YGETDAP I HRQN YKLDCL I TD I KDI LDS LGY 
SKCVL IGHDWGGM I AWL I AI CYPEMVMKL I VINP PWPNVPVE vt 
TiRHPAQLLKSSYYYFFQIPWFPEFMFSINDFKVLKHLFTSHSTG 
IGRKGCQLTTEDLEAYI YVFSQPGALSGPINHYRNI FSCLPIiKH 
HMVTTPTLIJ*WGENDAFMEVEMAEVTRFYVKNYFRLTILSEASH 
WLQQDQPDIVNKLIWTFLKEETRKKD 


6635 


1420 


470 


EMRAGQQUASMLRWTRAWRLPREGIjGPHGPSFARVPVAPSSSSG 
GRGGAEPRPLPLS YRtjIiDGEAALPAVVFIiHGLFGSKTNFNS IAK 
ILAQQTGRRVLTVDARNHGDS phspdms yeimsqdlqdllpqlg 

LVPCVWGHSMGGKTAMLIiALQRPELVERLIAVDISPVESTGVS 
K7AT YVAAMRAIN I ADELPRSPJ^RKLADEQLSS VIQDMAVRQHL 
LTNLVEVDGRFVWRVNLIXALTQHI^ 

LGGNSQFVHPSHHPEIMRLFPRAQMQTVPNAGHWIHADRPQDFI 
AAIRGFLV 


6636 


1514 


1801 


S FCMFSHKQDSHFQAVPVQEKKKRLRRAPWRAFAQPQRLKHPAB 
QPIVRQCLQRPPLCGVIX3PVQQQLPPSLGPVLSPHSDPGWCRVD 
DGGDGVF 


6637 


2 


1501 


CSSSPCFHIXn'CVLDKAGSYKCAaLAGYTGQRCEN^EAGKSKI 

KASEDSLSVLEERNCSDPGGPVNGYQKITGGPGLINGRIIAKIGT 

VVSFFCNNSYVLSGNEKRTCQQNGEWSGKQPICIKACREPKISD 

LVRRRVLPMQVQSRETPLKQLYSAAFS KQKLQSAPTKKPALP FG 

DI^MGYQHLHTQLQYECISPFYRRLGSSRRTCLRTGKWSGRAPS 

CIPICGKIENITAPKTQGLRWPW0^1YRRTSGVH1K3SIjHKGAW 

FLVCS GAL VNERTVVVAAKCVTDLGKVTM I KTADLKVVLGKFYR 

DDDRDEKTIQSLQISAIILHPNYDPILLDADIAILKLLDKARIS 

TRVQPICLAASRDLSTSFQESHITVAGWNVLADVRSPGFKNDTL 

R S G WS WDSLL CE EQHEDHG I P VS VTDNMFCASWE PTAPS D I C 

TAETGG IAAVS FPGRAS PEPRWHLMGLVS WS YDKTCS HRLSTAF 








TKVLPFKDWI ERNMK 


6638 


1391 


224 


GGI PQAGGKMAAPWWRAALCECRRWRGFSTSAVLGRRTPPLGPM 
PNSD I DLSNLERLE KYRS FDRYRRRAEQEAQAPHWWRTYRE YFG 
EKTDPKEKIDIGLPPPKVSRTQQLLERKQAIQELRANVEEERAA 
RLRTASVPIiDAVRAEWBRTOGPYHKQRLAEYYGLYRDLFHGATF 
VPRVPLHVAYAVGEDDIiMPVYCGNEVTPTEAAQAPEVTYEAEEG 
SLWTLLLTSLDGHLLEPDAEYLHWLLTNIPGNRVAEGQVTCPYL 
PP FPARGSGIHRLAFLLFKQDQP I DFSEDARPS PC YQ LAQRTFR 
TFDFYKKHQETMTPAGLSFFQCRWDDSVTYIFHQLLDMREPVFE 
FVRP P P YHP KQKRF PHRQP LRYLDRYRDSHE P TYGI Y 


6639 


204* 


1268 


IGCFIMDGGDDGNLI IKKRFVSEAELDERRKRRQEEWEKVRKPE 
DPBECPEEVYDPRSLYERLQEQKDRKQQEYEEQFKFKNMVRGLD 
EDETNFLDEVSRQQELIEKQRREBELKELKBYRNNIjKKVGISQE 
NKKE VEKKLTVKP I ETKNKFS QAKLLAGAVKHKSSESGNS VKRL 
KPDP E PDDKNQE PS SCKSLGNTSLSGPS IHCP S AAVCIG I LPGL 
GAYSGS SDS ESS SDS EGTI NATGK I VS S I FRTNTFLEAP 


6640 


117 


1043 


VLEPPDVSMAESEDRSLRiVLVGKTGSGKSATANTILGEEIFDS 
RI AAQAVTKNCQ KASREWQGRDLLVVDTPGLFDTKESIiDTTCKE 
ISRCIISSCPGPHAIVLVLLLGRYTEEEQKTVALIKAVFGKSAM 
KHMVILFTRKEELEGQSFHDFIADADVGLKSIVKECGNRCCAFS 
NSKKTSKAEKESQVQEIiVELIEKMVQCNEGAYFSDDIYKDTEBR 
LKQREEVLRKIYTDQLNEEIKLVEEDICHKSEEKKEKEIKLLKLK 
YDEKIKNIREEAERNIFKDVFNRIWKMLSEIWHRFLSKCKFYSS 


6641 


1 


894 


SAAVGRRSEVR^C^PRPRLRRSARRMBPVPGTDSAP^dLAWSS 
ASAP PPRGFSAI S CTVEGAPAS FGKS FAQKSGYFLCLSSLGSLE 
NPQENWADIQIWDKSPLPLGFSPVCDPMDSKASVSKKKRMCV 
KLLP LGATDTAVFDVRLSGKTKTV PG YLRIGDMGGFAI WCKKAK 
APRP VPKPRGLS RDMQGLS LDAASQPS KGGLLERTASRIjGS RAS 
TLRRNDS I YEAS SL YGI SAMDGVP FTLHPRFEGKS CS PLAFSAF 
GDLTI KSLAD IEEE YNYGFWEKTAAARIjPPSVS 


6642 


22 


1296 


PLEERMMTKMDPNDQAQRDI I FBLRRIAFDAESDPSNAPGSGTE 

KRKAMYTKDYKMLGFT^INPAMDFTQTPPGMI^^ 

HQDTYIRIVLENSSREDKHECPFGRSAIELTKMLCEILQVGELP 

NEGRNDYHPMFFTHDRAFEELFGICIQLLNKTWKEMRATAEDFN 

KVMQWREQITRALPSKPNSLDQFKS KLRSLS YS E ILRLRQS ER 

MSQDDFQSPPIV&LREKIQPEILELIKQQRLNRLCEGSSFRKIG 

NRRRQERFW YCRLALNHKVIiH YGDLDDNPQGBVTFESLQEK I PV 

ADIKAIVTGKDCPHMKEKSALKQNKEVLELAFSILYDPDETLNF 

lAPNKYEYCIWIDGI^ALI^KDMSSELTKSDLDTLLSMEMKLRL 

LDLENIQIPEAPPPIPKEPSSYDFVYHYG 


6643 


3049 


2265 


SLHAPAEGRTRGRIAEKPKMLTRKI KLWDINAHITCRLCSGYLI 
DATTVTECLHTFCRSCLVKYLEENNTCPTCRIVIHQSHPLQYIG 
HDRTMQDIVYXLVPGLQEAEMRKQREFYHKU5MEVPGDIKGETC 
SAKQHLDSHRNGETKADDSSNKEAAEEKPEEDNDYHRSDEQVSI 
CLECNSS KLRGLKRKW I RCS AQATVLHLKKFI AK KLNLS S FNE L 
DILCNEEIU3KDHTLKFVVVTRWRFKKAPLLLHYRPKMDLL 


6644 


1489 


290 


FRPLATEPRGSSPVQLVSSTMSVRTLPLLFLNLGGEMLYILDQR 
LRAQNI PGDKARKVLNDI ISTMFNRKFMEELFKPQELYSKKALR 
TVYERLAHAS IMKLNQASMDKLYDLMTMAFKYQVliLCPRPKDVL 
LVTFNHLDTIKGFIRDSPTILQQVDETLRQLTEIYGGLSAGEFQ 
LIRQTLLI FFQDLHI RVSMFLKDKVQNNNGRFVLPVSGPVPWGT 
EVPGLIRMFNNKGEBVKRIEFKHGGNYVPAPKEGSFEFYGDRVL 
KLGTNMYSVNQPVETHVSGS SKNLASWTQES IAPNPLAKEELNF 
LARLMGGMEIKKPSGPEPGFRLNLFTTDEEEEQAALTRPEELSY 
EVINIQATQDQQRSEELARIMGEFEITEQPRLSTSKGDDLLAMM 
DEL 


6645 


6530 


4646 


FVEGLAGYVYKAASEGKVLTLAALLLNRSESDIRYIiLGYVSQQG 
GQRSTPLI IAARNGHAKWRLLLEHYRVQTQQTGTVRFDGy V1D 
GATALWCAAGAGH FE VVKLL VSHGANVNHTTVTNSTPLRAACFD 
GRLDIVKYLVENNANISIANKYDNTCLMIAAYKGHTDWRYLLE 
QRADPKAKAHCGATALHFAABAGH ID IVKELI KWRAAIWNGHG 
MTPLKVAAESCKADWELLTiSHADCDRRSRI EALELLGAS FAND 
REN YD 1 1 KTYH YLYLAMLERFQDGDNI LEKE VLPPIHAYGNRTE 
CRNPQBLESIRQDRDAIiHMEGLI VRERILGADNIDVSHPI IYRG 
AVYADNME FEQC I KLWLHALHLRQKGNRNTH KDLLRFAQVFS QM 
IHLNETVKAPD I ECVLRCS VLE I E QS MNRVKN I S DAD VHNAMDN 
YECNLYTFLYLVCISTKTQCSEEDQCKINKQIYNLIHLDPRTRE 
GFTLLHLAVNSNTPVDDFHTNDVCSFPNALVTKLLLDCGAEVNA 
VDNEGNSALHI I VQYNRP I SDFLTLHS 1 1 1 S LVEAGAHTDMTNK 
QNKTPLDKSTTGVSE I LLKTQMKMSLKCIAARAVRAND IN YQDQ 
I PRTLEE FVGFH 


6646 


176 


890 


PSSRMNHLPEDMENALTGSQSSHASIjRNIHSINPTQIiMARIESY 
EGREKKGI SDVRRTFCIiFVTFDLLFVTLIiWI IELNVNGGIENTL 
EKEVMQYDYYSSYFDIFLLAVFRFKVLIIAYAVCRLRHWWAIAL 
TTAVTSAFLLAKVILSKLFSQGAFGYVLPI1SFILAWIETWFLD 
FKVLPQEAEEENRLL I VQDAS ERAAL I PGGLS DGQFYS PPE SE A 
GSEEABEKQDSEKPLLEL 


6647 


176 


850 


PSSRMNHLPEDMENALTGSQSSHASLRNIHSINPTQIiMARIESY 
EGREKKGISDVRRTFCLFVTFDIiLFVTIjIjWI ielnvnggientl 
EKEVMQYDYYSS YFDI FLLAVFRFKVLILAYAVCRLRHWWAIAL 
TTAVTS AFLIAKVI LSKL FSQGAFG YVLP I I5FIIAWIETWFLD 
FKVLPQEABEENRLLIVQDASERAALIPGGLSDGQFYSPPESEA 
GSEEAEEKQDS EKPLLEL 


6648 


413 


897 


RNCWNCFTKYFNS P PED 1 DH KDS YL I TRS I MAEPDY IEDDNPEL 
IRPQ^INPVKTSRNHQDLHRELLMNQKRGLAPQI^PELOKVME 
KRKRDQVI KQKEEEAQKKKSDLE I ELLKRQQKLEQLELE KQKLQ 
EEQENAPEFVKVKGNLRRTGQEVAQAQES 


6649 


1357 


832 


WIPRAAGlRHEVKWDVKEIMSQHNIYVDAtLKEFEQFNliRJLNBV 
SKRVRIPLPVSNILWEHCIRIJ^TIVEGYANVKKCSNEGRALM 
QLDFQQFLMKLEKLTDIRPIPDKEFVETYIKAYYLTENDMERWI 
KEHREYSTKQLTNLVNVCLGSHINKKARQKLLAAIDDIDRPKR 


6650 


32 


765 


LVPLVFSLLVQSCKQVYRSIAMKFVPCLLLVTLSCLGTLGQAPR 
QKQGSTGEEFHFC^GGRDSCTMRPSSIX3<»AGEVWLRVDCRNTD 
QTYWCEYRGQPSMCQAFAADPKSYWNQALQELRRLHHACQGAPV 
LRPSVCREAGPQAHMQQVTSSLKGSPEPNQQPEAGTPSLRPKAT 
VKLTEATQLGKDSMEELGKAKPTTRPTAKPTQPGPRPGGNEEAK 
KKAWEHCWKPFQALCAFLISFFRG 


6651 


3425 


1353 


AKELLKVGDFSLCAGP YQNTADTMENI/S KEPLASFVSES FDISA 
CGIATEHVKIDNSGEGLTAEAGSETLSRDGEVGVNSDMHYELSG 
DSDLDLLGDCRN PRLDLEDS YTLRGS YTRKKDVPTDGYES S LNF 
niHi\nvC'i>nULaaH v fwjo ioljlrr^inVV Jl AAvi\K±ifiA.L.VPP YVQIR 
DLHG I LRTYANFS ITKELKDTMRTSHGLRRHPSFSANOGLPSSW 
TSTWQVADDLTQNTLDLEYLRFAHKLKQTIKNGDSQHSASSANV 
FPKESPTQISIGAFPSTKISEAPFLHPAPRSRSPLIiVTWESDP 
RPQGQPRRGYTASSLDSSSSWRERCSHNRDLRNSQRNHTVSFHL 
NKLKYNSTVKESRNDISLILNEYAEFNKVMKNSNQFIFQDKELN 
DVSGEATAQEMYLPFPGRSAS YEDI I IDVCIWLHVKLRS WKEA 
CKSTFLFYIiVETEDKSFFVRTKNLLRKGGHTE I E PQHFCQAFHR 
ENDTL 1 1 1 1 RNED I SSHLHQ I PSLLKLKHFPS V I FAGVDS PGDV 
LDHTYQELFRAGGFVISDDKIIiEAVTLVQLKEIIKIItEKLNGNG 
RWKWIiIiHYRENKKLKEDERVDSTAHKKNIMLKS FQSANI IELLH 
YHQCDSRSSTKAEIIaKCLLNLQIQHIDARFAVLLTDKPTIPREV 








FENNGILVTDVNNFIENIEKIAAPFRSSYW 


6652 


2 


1343 


IPGSTISCSCHSRRLRGGSPAPRtiSLGAA^PRPRPPSLPLPLPL 
PFPLFLPTRPAERAWIRSRRASEWVGKMEVPRLDHALNSPTSPC 
EB VI KNLS LEA I QLCDRDGNKS QDSG I AEM E E LP VPHN I KI SNI 
TCDSPKISWEMDSKSKDRITHYFIDLNKKBNKNSNKFXHKDVPT 
KLVAKAVPLPMTVRGHWFLSPRTEYTVAVQTASKQVDGDYWSE 
WS E I IE FCTADYS KVHLTQLLE KAE VI AGRMLKFS VF YRNQHKE 
YFD YVREHHGNAMQPS VKDNSGSHGSP I SG KLEG I FFS CSTE FN 
TGKPPQDS P YGRYRFE I AAE KLFNPNTNLYFGDFYCMYTAYH YV 
I LVIAPVGS PGDE FCKQRLPQLNSKDNKFLTCTEEDGVLV YHHA 
QD VILE VI YTDP VDLS LGTVAE I TGHQLMS LSTANAKKD PS CKT 
CNISVGR 


6653 


170 


1910 


FFLEPRLRPFPASRARFVPARTRPSPLHPCCFCFEGGGSMLSPQ 
RVAAAASRGADDAMES S KPG PVQWLVQKOQH S FELDEKALAS I 
IiLQDHIRDLDWWSVAGAFRKGKSFILDFMLRYIiYSQKESGHS 
NWLGDPEEPLTGFSWRGGSDPETTGIQIWSEVFTVEKPGGKKVA 
WLMDTQGAFDSQSTVKDCATIFALSIWTSSVQIYNLSQNIQED 
DLQQLQLFTE YGRLAMDE I FQKPFQTLMFLVRDWSFP YEYSYGL 
QGGMAFLDKRLQVKEHQHEEIQNVRNHIHSCFSDVTCFLLPHPG 
LQVATS PDFDGKLKD LAG E FKEQLQAL I P YVLNPS KLME KE ING 
S KVTCRGLLE YFKAY I KI YQGEDLPHPKSMLQATAEAYNLAAAA 
SAKDIYYNNMEEVCGGEKPYLSPDILEEKHCEFKQLALDHFKKT 
KKMGGKDFS FRYQQELEEEI KEL YENFCKHNGS KNVFSTFRTPA 
VLFTGIVALYIASGLTGFIGLEWAQLENCMVGLLLIALLTWGY 
IRYSGQYRELGGAIDFGAAYVLEQASSHIGNSTQATVRDAWGR 
PSMDKKAQ 


6654 


1 


705 


RTSLS PSQ CS S FNLAMAS AGMQ I LGWLTLLG WVNGLVS CALPM 
WKVTAFIGNS I WAQWWEGLWMS CWQS TGQMQ CKVYDS LLAL 
PQDLQAARALCVIALLVALFGLLVYLAGAKCTTCVEEKDSKARL 
VLTSG I VFVI5GVLTLI PVCWTAHAVIRDFYNPLVAEAQKRELG 
ASLYLGWAASGLLLIjGGGLLCCTCPSGGSQGPSHYMARYSTSAP 
AISRGPSEYPTKNYV 


6655 


341 


1* 


KDAYMFKKGLLALALVFSLPVFAAEHWIDVRVPEQYQQEHVQGA 
INI PLKEVKERIATAVPDKNDTVKVYCNAGRQSGQAKE ILSEMG 
YTHVENAGG LKD I AM P KVKG 


6656 


2 


1212 


TELPPRPANLAlQPPLSPIiRALAPLPEKPGAVP.PPQKRMAKVAkl 
DLNPGVKKMSLGQLQSARGVACIjGCKGTCSGFEPHSWRKICKSC 
KCS QEDHCLTSDLEDDRK IGRLLMDS KYSTLTAR VKGGDGI RI Y 
KRNRMIMTNPIATGKDPTFDTITYEWAPPGVTQKLGLQYMEIjIP 

kekqpvtgtegafyrrrqlmhqlp 1 ydqdpsrcrgllenelklm 
eefvkqyksealgvgevalpgqgglpkeegkqqekpegaettaa 

TTNGSLSDPSKEVEYVCELCKGAAPPDSPWYSDRAGYNKQWHP 
TCFVCAKCSEPLVDIilYFWKDGAPWCGRHYCESLRPRCSGCDEI 
I FAED YQRVEDLAWHRKHFVCEGCEQLIiSGRAYI VTKGQLLC PT 
CSKSKRS 


6657 


830 


2120 


tLTCQERAGDCLLSASTMkEVVYWSPKKVADWLLENAMPEYCEP 
LEHFTGQDLINLTQEDFKKPPI^CRVSSDNGQRLIiDMIETLKMEH 
HLEAHKNGHANGHLNIGVDIPTPDGSFSIKIKPNGMPNGYRKEM 
IKIPMPELERSQYPMEWGKTFLAFLYALSCFVLTTVMISWHER 
VPPKEVQPPLPDTFFDHFNR VQWAFS I CE INGM I L VGL WLI QWL 
LLKYKS I ISRRFFCIVGTLYLYRCITMYVTTLPVPGMHFNCS PK 
IjFGDWEAQLRRIMKLIAGGGLS itgshnmcgdylysghtvmltl 
TYLFI KE YS PRRLWWYHWI CWLLS WG I FCILLAHDHYTVDVW 
AYYITTRLFWWYHTMANQQVLKSASQMNLLARVWWYRPFQYFEK 
NVQG I VPRS YHWP FP WPWHL SRQVKYSRLVNDT 


6658 


35 


855 


HCCALGA PGS PYRGL YFS S AAPCTAPRKAKHQSTLEGLTKRMLM 








FDPVPVKQEAMDPVSVSYPSNYMESMKPNKYGVIYSTPLPEKFF 
QTPEGLSHGIGJ1BPVDLTVNKRSSPPSAGNSPSSLKFPSSHRRA 
SPGLSMPSSSPP IKKYSPPSPGVQPFGVPLSMPPVMAAALSRHG 
IRSPGILPVIQPVWQPVPFMYTSHLQQPLMVSLSEEMENSSSS 
MQVPVIESYEKPISQKKIKIEPGIEPQRTDYYPBEMSPPLMNSV 
SPPQALLQE 


6659 


18 


523 


EPQRGDCETWFQNCSLPKFVCFFCWGFWLWRAHSMSNUJSLPGL 
RGLTSISRNQLQCTNAMRVINNYQRRWKNQNTFLLATFANWNV 
CGNPTITCPHNRTLNNCHHSGVQVPLMYCWLTTPSPQNISNCRY 
AQTPANMFYI VACDNRDQRRDPPQYP WPVHLHTI I 


6660 


514 


1707 


CAASLDCRHHJ^EPDMKLVAflPSAKLLQAAAGASARACDSVTSNV 
LPLLLEQFHKHSQSSQRRTIIiEMLLGFLKLQQKWSYEDKDQRPL 
NGFKDQLCSLVFMALTDPSTQLQLVGIRTLTVU3AQPDLLSYED 
LELAVGHLYRLSFLKEDSQSCRVAALEASGTLAALYPVAFSSHL 
VPKLAEBLRVGBSNLTNGDEPTQCSRHLCCLOALSAVSTHPSIV 
KETLPLLLQHLWQVNRGNMVAQSSDVIAVCQSIiRQMAEKCQQDP 
ESCWYFHQTAIPCLLALAVQASMPEKEPSVLRKVLLEDEVLAAM 
VS VIGTATTHLS PEliAAQSVTHI VPLFLDGNVSFLPEWS FPSRF 
QP FQIX3S SGQRRL I ALLMAFVCS LPRNVS EHIWEVLLFNLDKVT 
PG 




179 


430 


GVHAASGTLSATWIAE1AKMFDSLAKAGKYLGQAAKLM IGMPDYD 
NYVEHMRVNHPDQTPMTYEEFFRERQDARYGGKGGARCC 


6662 


185 


423 


rslpkpapaqpasihcarfsgvtpptaktamsdgntafnaLmVc 

GPKADDGNI FSACAPASSAVKAS VS VAQPGQAVIP 


6663 


3 


1005 


RPVLSSRVDDFVPPLPETSGRRKKLERMYSVDRVSDDIPIRTWF 
PKENLFSFQTASTTMQAISNFRKHLRMVGSRRVKAQTFAERRER 
SFSRSWSDPTPMKADTSHDSRDSSDLQSSHCTLDEAFEDLDWDT 
BKGLEAVACDTEGFVPPKVMLISSKVPKAEYI PTI IRRDDPSI I 
P I LYDHEHATFED ILEE I ERKLNVYH KG AKI WKMLI FCQGGPGH 
LYLLKNKVATFAKVEKBEDMIHFWKRLSRLMSKVNPEPNVIHIM 
GCYILGNPNGEKLFQNLRTLMTPYRVTFESPLELSAQGKQMIET 
YFDFRLYRLWKSRQHS KLLDFDDVL 


6664 


58 


968 


PRLLRLPRSVWMDSPWDELALAFSRTSMFPFFDIAHYLVSVMA 
VKRQPGAAALAWKNPXSSWFTAMLHCFGGGILSCIiliLAEPPLKF 
LANHTNI LLAS S I WY ITFFCPHDLVSQGYSYLP VQLLASGMKE V 
TRTWKI VGGVTHANS Y YKNG WIVM I AIGWARGAGGTI ITNFERL 
VKGDWKPEGDE WLKNS YPAKVTLLGS VI FTFQHTQHLA I S KHNL 
MFLYT I FI VATKITMMTTQTSTMTFAPFEDTIiS WMLFGWQQPFS 
S CE KKS EAKS P SNGVGS LAS KP VD VASDNVKXKHT KKNE 


6665 


171 


1278 


DERRLACRQWTQQRSELYPGFQKRQRFLPKAGEEAAAQGGRHL 
PGRWLGPGCTQNPCSVHTATGPEPRKIiPIiLPPDSPNSGYPKEPA 
ALC PGI PS PCRMTHQDLS I TAKL INGGVAGLVGVTCVFP I DLAK 
TRLQNQHGKAMYKGMIDCLMKTARAEGFFGMYRGAAVNLTLVTP 
EKAI KIAANDFFRRLLMEDGMQHNLKMEMLAGCGAGMCQVVVTC 
PMEMLKIQLQDAGRLAVHHQGSASAP STSRS YTTGSASTHP rdq 
ATLIAWELLRTQGLAGL YRGLQATLLRDIPFS 1 1 YFPLFANLNN 
LGFNELAGKAS FAHS FVSGCVAGS IAAVAVTPLDVLKTRI QTLK 
KGLGEDMYSGITDCAR 


6666 


498 


2868 


MTTFLPVPQMMAGFSFGTFGNPPMESPSAWQTIHQPFIVSCLTL 
WSPGCWPQPIQKEGVGLWDIRKPQSSIjLRYGGNLSLQSAMSVRF 

nsngtqllalrrrlppvlydihsrlpvfqfdnqvyfnsctmksc 
cfagdrdqvilsgsddfnlymwripadpeaggigrwngafmvl 

KGHRS IVNQVRFNPHTYMICSSGVEKI IKIWSPYKQPGCTGDLD 

grieddsrclytheeyislvlmsgsglshdyanqsvqedprmma 

FFDSLVRREIBGWSSDSDSDIjSESTILQLHAGVSERSGYTDSES 
SASLPRSPPPTVDESADKAFHLGPIJIVTTTNTVASTPPTPTCED 








AASRQQRLSAJbRRYQDKRLLALSNBSDSEENVCEVELDTDLFPR 
PRSPSPEDES33SSSSSSSEDBEELNERRASTWQRNAMRRRQKT 
TREDKPS AP IKPTNT Y I GEDNYDYPQ I KVDDLSSSPTSS PERS T 
STLEIQPSRASPTSDIESVERKIYKAYKWLRYSYISYSNNKDGE 
TSLVTGEADEGRAGTSHKDNPAPSS SKEACLN I AMAQRNQDLP P 

EGCSKDTFKEETPRTPSNGPGHEHSSHAWAEVPEGTSQDTGNSG 
SVEH?FETKKLHGKALSSRAEEPPSPPVPH^qnQTT.MCr5c:«xir , D 

RTQSDDSEERSLETICANHNNGRLHPRPPHPHNNGQNLGELEW 
AYSS PGHS DTDRDNS S LTGTLLHKDCCGS BMACETPNAGTR KD P 
TDTPATDS SRAVHGHSGLKROR 1 EL30mi?PWQ Q e t?v v t . v«p 


6667 


171 


1310 


AEEVERLAAMRSDSLVPGTHTPP I RRRSKFANl^GRI FKPWKWRK 
K KSEKFKHTSAALERKISMRQSREELIKRG VLKE I YDKDGSLS I 
SNEEDSLENGQSLSSSQLSLPALSEMEPVPMPRDPCSYEVLQPS 
DIMDGPDPGAPVKLPCLPVKLSPPLPPKKVM1 CMPVGGPDLSLV 
SYTAQKSGQQGVAQHHHTVLPSQ1QHQLQYGSHGQHLPSTTGSL 
PMHPSGCT!RMIDELNKTIiAMTMQRI 1 ESSEOnVPr<5'r<5VM<5Qr , T wo 

GOGVTKAGPMGLPEIRQVPTVVIECDDNKENVPHESDYEDSSCL 
YTREEEEEEBDEDDDSSLYTSSLAMKVCKKDSLAIKPSNRPSKR 
ELEEKNI LPRQTDEERLELRQQIGTKL 


6668 


714 


358 


TLAVATGP ALT LRCHV tTS S 3NCKHSWCPASSRFCKTTNTVEP 
LRGtHjVKKDC^ESCTPSYTLQGQVSSGTSSTQCCQEDLCNEKLH 
NAAPTRTALAHS ALSLGLAL S LLAVI LAPSL 


6669 


459 


1207 


KDEETRKDYDYMLDHPEEYYSHYYHYYSRRLAPKVDVRWIIiVS 
VCAISVFQFFSWWNS YNKAI SYLATVPKYRIQATE I AKQQGLLK 
KAKEKGKMKKSKEEIRDEEENIIKNI I KSKIDIKGGYQKPQICD 
LLLFQI ILAF FHLCS YIVWYCRWIYMPNTlfRKPVfiPPRPT.VT TO 

KSMKMSKSQFDSLEDHQKETFLKRELWIKENYEVYKQEQEEELK 
iCKLANDPRWKRYRRWMKNEGPGRLTFVDD 


6670 


184 


594 


VARI * GEAAKMSSE PPPPYPGGPTAPliLEBKSGAPPTPGRSS PA 
VMQPPPGMPLPPADIGPPPYEPPGHPMPQPGFI PPHMSADGTYM 
PPGFYPPPGPHPPMGYYPPGPYTPGPYPGPGGHTATVIiVPSGAA 
TTVTV 


6671 


1 


763 


L P AEKP RS A PNMAG GRCGP Q LTAitAAW IAAVAATAGPEEAALP 

peqsrvqpmtasnwtlvpflegewmlkfyapwcpscqqtdseweaf 
akngeii^isvgkvdviqepglsgrffvttlpaffhakdgxfrr 
yrgpgifedlqnyilekkwqsvepltgwkspasltmsgmaglfs 
I S GKI VniliHNY FTVTLGIPAW cs yvf fv i at lv fglsmdl vl * v 

ISQCNWDPPYRHVS * /RPSTNLGVHTAHTSEHLRL 


6672 


304 


1085 


APGS KPVQ FMD FEG KTS FGMS VFNLSNAI MG SG I LGLAYAI^iAHT 
G VI FFLALLLCIALLS S YSIKLLLTCAGIAG I RAYEQLGQRAFG 
PAGKVWATVTCXHNVGAMSS YLFI IKSELPLVIGTFLYMDPEG 
D W FLKGNLL 1 1 1 VS VL I I LPLALMKHLG YLG YTSGLSLTCML FF 
LVS VI YKKFQLGLCYRATMKQQWES EAL VGTPQPRDS TAAVKAQ 
MFHS*LTGVLTQWPIMAFAFVCHPGGAGPS ITELCRAFQAQD 


6673 


1116 


1963 


lqiqthhthhgarvthlgshqllanagtmlcrqqsssMapafsq 
s vtcgps pc vrkqes atkclhigacgsdlwargmeqg+ g * glnv 
wlcpcvafhrgarpqaeeggarwnslvsspwippnp *hss igae 
navprp*oxj*kvnpsgqerqs\wvlplpvpgeplklpglpg*nk 

SFSRV/SGSKGKWILPRQLM*AS*R\TPRFVPGTQWVPITW/PL 
ITWH*SJU»TPPLKACPAPRI!SDPCSSCLSCPCVTQKPRFSDTGW 
FGAGHCHSSCDFTRKGAAGGPG 


6674 


1 


440 


LEFDYMCQYDYVEVRDGDNRDGQI I KRVCGNERPAPXQS iGSSL 
HVLFHSDGSKNFDGFHAIYEEITACSSSPCFHDGTCVLDKAGSY 
KCACLAG YTGQRCENLLEERNCSDPG/ WPSQWVP ENNRG PWAYQ 
PTPC* IGTRVAFFLT
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
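
By way of illustration only, the symbol legend above can be applied programmatically to a single amino acid segment. The following Python sketch is an assumption-laden example and not part of the disclosed method: the helper name `summarize_segment` is hypothetical, whitespace is treated as layout only, and any character outside the legend (including lower-case letters) is reported as unrecognized rather than interpreted.

```python
# Symbol legend copied from the table header; nothing else is assumed
# about the meaning of a character.
RESIDUES = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid",
    "E": "Glutamic Acid", "F": "Phenylalanine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "K": "Lysine",
    "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine",
    "S": "Serine", "T": "Threonine", "V": "Valine",
    "W": "Tryptophan", "Y": "Tyrosine", "X": "Unknown",
}
MARKS = {
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize_segment(text):
    """Count legend symbols in one table cell (hypothetical helper)."""
    summary = {
        "residues": 0,
        "stop codon": 0,
        "possible nucleotide deletion": 0,
        "possible nucleotide insertion": 0,
        "unrecognized": [],
    }
    for ch in text:
        if ch.isspace():
            continue  # spacing and line breaks are layout, not data
        if ch in RESIDUES:
            summary["residues"] += 1
        elif ch in MARKS:
            summary[MARKS[ch]] += 1
        else:
            # Not in the legend: keep it visible instead of guessing.
            summary["unrecognized"].append(ch)
    return summary


if __name__ == "__main__":
    # Last line of a segment in this table that carries a "\" mark.
    print(summarize_segment("KASTKFWIKQKPISIDSDLLCAC\\DLAEE"))
```

Reporting unrecognized characters separately, rather than dropping them, keeps any symbol that falls outside the legend available for manual review.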


6675 


277 


1678 


GNWPTERMAFLDNPTI ILAHIRQSHVTSDDTGMCEMVL I DHDVD 
LEKIHPPSMPGDSGSEIQGSNGETQGYVYAQSVDITSSWDFGIR 
RRSNTAQRLERLRKERQNQIKCKKIQWKERNSKQSAQELKSLPB 
KKSLKEKPPISGKQSILSVRLEQCPLQLNNPPNEYSKPDGKGHV 
GTTATKK ID V YLPLHS SQDRLLPMTWTMASARVQDL I GL I CWQ 
YTSEGREP KLNDNVS AYCLHI AEDDGEVDTDF P PLDSNEP I HKF 
GFSTLALVEKYSSPGLTSKESLFVRINAAHGFSLIQVDNTKVTM 
KE I LLKAVKRRKGSQKVSGS RADG VFEEDS QI D I ATVQDMLSSH 
HYKSFKVSMIHRLRFTTDVQL/GCA^FPGVLRKRAAPVDCLRPS 

nui rf AWl^JlwL.VA7/iMV^A>UjKo w L/onlvC w Ct 1oVjJJK.VB1 DPVTNQ 
KASTKFWIKQKPISIDSDI»LCAC\DIiAEE 


6676


277 


1678 


GKWPTERMAFLDNPTI I LAHI RQSHVTSDDTGMCEMVL IDHDVD 
LBKIHPPSMPGDSGSEIQGSNGETQGYVYAQSVDITSSWDFGIR 
RRSNTAQRLERLRKERQNQIKCKNIQWKERNSKQSAQELKSLFE 
KKSIiKEKPPISGKQSILSVRLEQCPLQLNNPFNEYSKFDGKGHV 
GTTATKKIDVYLPLHSSQDRLLPMTWTMASARVQDLIGLICWQ 
YTS EGREPKLNDNV5AYCLHI ABDDGE VDTDFP PLDSNE PI HKF 
GFSTIiALVEKYSSPGLTSKESLFVRINAAHGFSLIQVDNTKVTM 
KEILLKAVKRRKGSQKVSGSRADGVFEEDSQIDI ATVQDMLSSH 
HYKSFKVSMIH1?LRFTTDVQL/GCALFPGVLRKRAAPVDC1jRPS 

KASTKFWI KQKP I S IDSDLLCAC\DLABE 


6677 


277 


1678 


GNWPTERMAFLDNPT 1 1 LAHIRQSHVTSDDTGMCEMVLIDHDVD^ 

LEKIHPPSMPGDSGSEIQGSNGBTQGYVYAQSVDITSSWDFGIR 

RRSNTAQRIiERLRKERQNQIKCKNIQWKERNSKQSAQELKSLFE 

GTTATKKI DVYLPIjHS SQDRLIiPMTVVTMAS ARVQDLIGI>I CWQ 

ytsegrep klndnvsayclhtaeddgevdtdfppldsnepihkf 
gfstlalvekysspgltskeslfvrinaahgfsliqvdntkvtm 
keiliikavkrrjcgsqkvsgsradgvfebdsqidiatvcdmlssh 
hyksfkvsmihrlrfttdvql/gcalfpgvlrkraapvdclrps 

ADTWRQEQIGCCGAACAAI»RS*DSHKC*EGISGDKVEIDPVTNQ 
KASTKFWIKQKPISIDSDLLCAC\DLAEE


6678 


221 


865


GPSNQSSGSLSLIVTGCSSYWS*INDTCTILRVLSSNFGRQ*LR 
PFPCSQLPMSQGCLWHLDCCCPWVPYIPGQQWRKGRQRMRN *QS 
LLGSDQES VGLEDLCVFVNFLLHVLLGZjFP * PHELFLLPWDLG 
FLFPLLLG^CHCLVLPANLVSQAPQIGKtiSCRLQTHDIiEGSRN 
HHPLFLWGRWDAVKHLETVQSGLASLGFVGQHTSHGPP 


6679 


2 


786 


LEFARGAMPFLGQDWRSPGQNWVKTVDGWKRFLDEKSGSFVSDL 
SSY CNKE VYNK ENLFNS LNYD/ S CS Q E EKEGHAE * QNQNS \DFH 
QEKWIYVHKGSTKERHGYCTLGEAFKRLDFSTAILDSRRFNYW 
RLLELIAKSQLTSLSGIAQKNFMNILEKWLKVLEDQQNITLIR 
ELLQTLYTSLCTLVKRVGKSVLVGNINMWVYRMETILHWQQQLN 
NIQITRVSGQAQPPPGSGSLHRDTGQTRQDFEFTPVTEESGLF 


6680 


1498 


2951 


plctlplmpsalpgwagbrwekqwpla/ PGPGTWQTPVGS I SEE 

P\RKNEPDTHCPRGEARPEV*HLPKPHSPGSEGAEIQTSA*ALP 
/NQVSPPQPM*GAEENGDQRGGKEEAGEELHRSSSGLTAAPGF? 
EVHRNLQTFPGLPSRGGGP /GGAGTQGSWAPGEQPP/ SPLLPAS 
MQRSQAGLPG WEAGLVES PTHHI PALRPSGTNATGEAFPSTTCS 
SGP \ PAP PGP TGLRPGGG S S SGGHG * * PGLP VGKV\GALGAAQD 
PQSQGRGPTQGTVGTEMI*LSGLGSAKACPAARPAVP*LPSDPAS 
TIPKKGTRGFGEGPGVLQERNRMWGRAQGFTSADAAGTAPPGV 
♦LPAPLSQPPGATEPQVRACGMAPPSPGTSGRLVANGRHPGPQV 
AQGCPPGAGC WGSQPRGSQRCPRTYTHS PLGKGRAPCPRRCWH* 
WQDPPSSPRTGCLPGIPARQAYSAPRTRSRPGIRTGRAAYGFIR 
FQGGGGG
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
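
As a non-limiting illustration, the two nucleotide location columns can be compared against the length of the listed segment. The Python sketch below rests on assumptions that the table itself does not state: that both locations are inclusive, that each residue or stop symbol corresponds to three nucleotides, and that a row whose beginning location exceeds its end location simply runs in the opposite direction. The helper name `expected_codons` is hypothetical.

```python
def expected_codons(begin, end):
    """Whole codons spanned by two nucleotide locations (hypothetical helper).

    Assumptions, not taken from the table: both locations are inclusive,
    and the order of the two values only indicates direction.
    Returns (codons, leftover_nucleotides).
    """
    span = abs(end - begin) + 1
    return divmod(span, 3)


if __name__ == "__main__":
    # Row for SEQ ID NO: 6698 in this table: locations 668 and 754.
    print(expected_codons(668, 754))  # (29, 0)
```

Under these assumptions the row for SEQ ID NO: 6698 (locations 668 and 754) spans 29 codons, which agrees with the 29 residues listed for that row. A nonzero remainder may simply reflect a "/" or "\" mark in the segment and is not by itself evidence of an error.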


6681 


1169 


sTI 


XNyiYYNQQgRAFHBLK\EKLMSAPALGLPDLTKLFTLHVSEliE 
KM TVGVLTQTVGP WS R PGA YL S KQLDGVSKGWPPCPRALAATAL 
LAQEADBLTLRQNLNRKSPHA\WTLINTKGHH*LINARLTRYQ 
TLLCENPHKTlEVSNT/LNPATIiLLVTESPVKHNCLEVLDSVYS 
SRPNLRDHP* TS VDWSLYVDGSGFANPCKVTLKKETSPAPVTPR 
S 


6682 
6683


109


1238 


TVLCGAMQVSSLNEVK1YSLSCGKSLPEWLSDRKKRALQKKDVD 
VRRRIBLIQDFEMPTVCTTIKVSKDGQY1LATGTYKPRVRCYDT 
YQLSLKFERCIiDSEWTFEILSDDYSKIVFLHNDRYIEFHSQSG 
FYYKTRI PKFGRDFSYHYPSCDLYFVGASSEVYRLNLEQGRYLN 
PLQTDAABNNVCD INS VHGLFATGT IEGRVECWDPRTRNR VGLL 
D\AP*TVSQQIQR*TSLPTISALKFN\GALTMAVGTTTGQVLLY 
DLRSDKPLLVKDHQYGLPIKSVHFQDSLDLILSADSRIVKMWNK 
NSGKI FTSLE PEHDLNDVCLYPNS GMLLTANETPKMG I YYI P VL 
GPAPRWCSFLDNLTEELEENPESNE 




109 


1238 


tvlcgamqvsslnevkiysi^cgkslpewi^dr^^ralOkkdUd^ 

VRRRI EL I QD FEMPTVCTTIKVS KDGQYILATGT YKPRVRC YDT 
YQLS LKFE R CLDS E WTFE I LSDD YSKI VFLHNDR YI E FHSQSG 
FYYKTRIPKFGRDFSYHYPSCDLYFVGASSBVYRLNLEQGRYLN 
PLQTDAAENNVCDINS VHGIiFATGTI EGRVBCWDPRTRNRVGLL 
D\AP*TVSQQIQR*TSLPTISALKFN\GALTMAVGTTTGQVLLY 
DLRSDKPLLVKDHQYGLP I KSVH FQDS LDL I L S ADSR I V KMNNK 
NSGKI FTSLE PEHDLNDVCLYPNSGMLLTANETPKMGI YYIPVL 
GPAPRWCSFLDNLTEELEENPESNE 


6684 


111 


527 


GliRGGTSRGRAGREPBFAAGVLCWAG FCQS PCPPGGRGREAP A 

PP\SGRRHA*RPA*WLGGPGGDSGGREEGGS/GELQRAMESKMG 

ELPLDINIQEPRWDQSTFLGRARHFFTVTDPRNLLLSGAQLEAS 
RNIVQNYR 


6685 


258 


1473 


KLLGDNFEGFCNKFELSDSENGSNS*QSPL\FDRLFDPDPQK7L 
QGVIDMKNAVIGNNKQKANLIVLGAVPRLLYLLQQETSSTELKT 
ECAWLGSLAMGTENNVKSLLDCHI I PALLQGLLS PDLKFIEAC 
LRCLRTlFTSPVTPEEIiYTDATVIPHLMALLSRSRYTQEYICQ 
IFSHCCKGPDHQTILFNHGAVQNIAHLLTSLSYKVRMQALKCFS 
VLAFENPQVSMTLVNVIiVDGELLPQ I FVKMLQRDKPIEMQLTSA 
KCLTYMCRAGAIRTDDNC I VLKTLPCLVRMCS KERLLEERVEGA 
ETLAYJj I EPDVELQRIAS I TDHLI AMLADYFKYPSS VSAITDI K 
RLDHDLKHAHELRQAAFKLYASLGANDEDIRKKVSLGEGRPPVL 


6686


310


927 


DSVTFDDLAVDFTPKEWTLLDP^RNLYRDVMLENYKNLATVGY " 
QLFKPSLISWLEQEESRTVQRGDFQASEWKVQLKTKELALQQDV 
LGEPTS SGIQMIGSHNGGE VSDVKQCGDVSSEHS CZJCTHVRTQN 
SENTFECYLYGVDFLTLHKKTSTGEQRSVFSHVWKKPSSLNPDV 
VCQKNRCTRKKKAF*LQLTLGKSFH*S IHT 


6687 


181 


915 


EAMLEAP YKKEEDEQQRKE VKKD YPSNTTS S TS NSGNETSGS ST 
IGETSNRSRDRDRYRRRNSRSRSPGRQCRHRSRSWDRRHGS E SR 
SRDHRREDRVHYRSPPLATGEPVDNLSPEERDARTVFCMQLAAR 
IRPRDLEDFFSAVGKVRDVRIISDRNSRRSKGIAYVEFCEIQSV 
PLAIGLTGQRLLGVP I IVQAS QAEKNRLAAMANNLQKGNGGPMR 
LYVGSLHFNITEDMLRGI FEPFGKV 


6688 


1025 


1 


AEVPOTPRVFHKCPDSC^RFKFXJPiQL0PYIlXSF i S5kkPPI^P , ~ 

SEPGLPR/ S ATARMATAAAPPNSS IDLPSDSGMGFI S PAGDSLD 

LPSDGGTGFFSLAGDSSSTRLSSLAFISFSLSSVSVGSSAGTTS 

STSVGSWAAFTSSSSSSTNRDVAGLDFSTVITSVSGSLVPSRE 

VAVI CG S KG AGASGS ASCS SRAGKTTEATAASSMPSGTSS FSTC 

TMSELEELFSLPSPAPLLSKLFTSSGSIAICCQDSGPSDTGRLS 

VC^I^IiADSDTGKLSDCQEVVTVGDSGGIiTCPEIiSLGRM*MSLL
SSAVI PGYSS S SDSRLNTVPTVDLLCP FQTKS ST 


6689 


640 


1299 


SSSASYATSATSISDTAFSGSLKLKHGLLSALDSSSRTS*STSS 
AEDSTFRICSPSVSDTSSDSSGSKDNVLILFSKVSI*SCFSLSS 
FFSDSISFCPSSSSFCKR*FVSSKVSQNALLSSRLSNGPGGSSK 
QRNSLTARQLAMSL*ATKF*RNACNPNCLSSKKSAL*LSLNQRF 
GGSASRKPGNISFNSQKCSALSYCCNFVI KPREVSVSS BNYPAF 


6690 


1 


442 


GTRGKMAATLGPLGSWQQWRRCLSARDGSRMLLLIiLLLGSGQGP 
QQVGAGQTFE YLKREHSLS KP YQG VGTGS S SLWNLMGNAM VMTQ 
YI RLT PDMQS KQGAL WNRVPCFLRDWBLQVHFKI HGQG KKNL\ H 
GDGLAIWYTKDRMQP 


6691 


287 


1401 


LKTETSEEKARRYKDRPSQLNAVFQEQKKMIQAQE^ITliBDVAV 
DFTWEEWQLLGAAQKD LYROVMLENYSNLVAVG YQAS KPDALFK 
LEQGEQLWTI EIX3 IHSGACSDI WKVDHVLBRLQSESLVNRRKPC 
HEHDAFEN I VHCS KSQFLLGQNHD I FDLRGKS LKSNLTLVNQS K 
GYEIKNSVEFTGNGDSFLHANHERLHTAIKFPASQKLISTKSQF 
I S PKHQKTRKLEKHHVCSECGKAF I KKS WLTDHQVMHTGEKPHR 
CSLCEKAFSRKFMLTEHQRTHTGEKPYECPECGKAFLKKSRLNI 
HQKTHTGEKPYI CSECG KGFIQKGNLI VHQRIHTGEKP YI CNEC 
/GKGFIQKTCLIAHQRF1ITER 


6692 


178 


939 


WIKEGEUSLWBRFCANI IKAGPMPKHIAFIMDGNRRYAKKCQVE 
RQ EGH S QG FNKLAETL R WCLNLG I LEVTVYAFS I EN FKRS KS EV 
DGLMDLARQKFSRLNEEKEKLQKHGVCIRVLGDLHLLPLDLQEL 
I AQAVQATKNYNKC FLNVC FAYTS RHE I SNAVREMAWGVEQGLL 
DPSD I SESLLDKCLYTNRS PHPDILIRTSGEVRLS DFLLWQTSH 
SCLVFQPVLWPEYTFWNLFEAILQFQMNHSVLQK 


6693 


178 


939 


WIJKEGELSLWERFCANIIKAGPMPlOilAFIMDGNRRYAKKCQVE 
RQEGHS QGFNKLAETLRW CLNLG I LEVTVYAFS I ENFKRS KSEV 
DGLMDLARQKFSRLMEEKEKI^KHGVCIRVLGDLHLLPLDLQEL 
IAQAVQATKNYNKCFIiNVCFAYTSRHEISNAVREMAWGVEQGIili 
DPSD I SESLLDKCLYTNRS PHPD I LI RTSGEVRLS DFLLWQTSH 
SCLVFQPVLWPEYTFWNLFEAILQFQMNHSVLQK 


6694 


292 


813 


SLLLHLAPPGAYTPSQPLSSVSTETASSVRRQAAESRQHELPVR 
EVHSLGQILPQDGLTAEAGPPEAQDPWGSPGISLPAAHIGFAAA 
LAVGPSGCHTEP\FDEVWPSLFLGDAYAARDKSKLIQLGITHW 
NAAAGKFQVDTGAKFYRGMSLEYYGI EADDNPFFDLSVYFLP 


6695 


292 


813 


SLLLHIiAPPGAYTPSQPLSSVSTETASSVRRQAAESRQHELPVk 
E VHS LGQ I LPQDGLTAEAGP PBAQDP WGSPGISLPAAH I GFAAA 
LAVGPSGCHTEP\FDEVWPSLFLGDAYAARDKSKLIQLGITHVV 
NAAAG KFQVDTGAKFYRGMS LE YYG I EADDNP FFDLS VYFL P 


6696 


1 


782 


PR VRGRVGERWAFLS VPAAMS S EME PLLLAWS YFRRRKFQLCAD 
LCTQMLEKSPYDQAAWILKARALTEMVYIDEIDVDQEGIAEMML 
DENAIAQVPRPGTSLKLPGTNQTGGPSQAVRPITQAGRPITGFL 
RPSTQSGRPGTMEQAIRTPRTAYTARPITSSSGRFVRLGTASML 
TSPDG P F INLSRLNLTKYS QKPKLAKAL I E YI FHHENDVKTALD 
LAALS T EH SQYKD WW WK/DQ IEKCY YRVGM YRB AE KQ I KS S


6697


3


782 


PPLFLRRLNSRALRPGSRJCVMAWPAS LSGQDVGS FAYLTIKDR 
I PQI LTKVIDTLH RHKSBF FE KHGEEG VEAE KKAI S LLS KLRNE 
LQTDKPFIPLVEKFVDTDIWNQYLEYQQSLLNESDGKSRWFYSP 
WLLV\ECYMYRRIHEAI\IQSPPIDYFDVFKESKEQNFYGSQES 
I IALCTHLQQLIRTI EDLD\ENQLKDEFPKLLQI S LWGEI SVDL 
SLXSGGESSSQNTNVIjNSLEDLKPFILLNDMEHLWSLLSNCK 


6698 


668 


754 


VGSCACAGSCKCKECKCTSCKKSECRAFP 


6699 


325 


492


EGELP / PARRVLPRAMTASAQPRGRRPGVGVGWVTS CKHPRCV 
LLGKRKGSVGAGSFQLPGGHLEFGETWEECAGRETWEEAALHLK 
NVHFASVVNSFIEKENYHYVTII^IKGEVDVTHDSEPKNVEPEKN
ESKRI I YNHAFFFQBSKWSGGILQ 


6700 


1098 


1392 


TQCWRS STPGMRTHFRTQP / RLECGQGFSQQENGHCMDTNECIQ 
FPFVC PRDKPVCVNTYGS YRCRTNKKCSRG YBPNEDGTAC VERT 
LLLGLCNLLGK 


6701 


2 


1485 


AAAGPRTRVRRAAAFEGQPSPSPGLGPTSDKAAAPRTPKRRRLW 
RQRQ/HPAMLCYVTRPDAVLMEVEVEAKANGEDCLNQVC31RLGI 
IEVDYFGLQFTGSKGESLWLNLRNRISQ0^4IX3LAPYRLKLRVKF 
FVE PHLI LQEQTRH IFFLHI KEALLAGHLLCS PEQAVE LSALLA 
QTKFGDYNOOTAXYNYEELO^LSSATLNSIVAKHKELEGTSQ 
AS AEYQVLQIVSAMEN YGI EWHSVRDSEGQKLlj IGVGPEG I SIC 
KDDFS PINRIAYPWQMATQSGKNVYLTVTKESGNSI VLLFKMI 
STRAAS GL YRAITETHAFYRCDTVTSAVMMQYS RDLKGHLASLF 
LNENINI^KKYVFDIKRTSKEVYDHARRALYNAGVVDLVSRNNQ 
SPSHSPLKSSESSMNCSSCEGLSCQQTRVLQEKLRKIiKEAMLCM 
VCCEEEINSTFCPCGHTVCCESCAAQLQVGESAAHFCLQPHLSL 
LLTGSRSQVLAR 


6702 


397 


1971 


PIAKFLKIJ}LVMru:LP^ 

RAEAIiLCSRKATVVRDLVAVRMAEEQEFTQLCKLPAQPSHPHCV 

NNT Y RS AQHSQALLRGL LALRDSG ILFD WLWEGRHIEAHR I L 

LAAS CD YFKGMFAGGLKEMEQEE VLI HGVS YNAMCQ I LHF I YTS 

ELELSLSNVQETLVAACQLQIPEIIHFCCDFLMSWVDEENILDV 

YRLAELFDLSRLTE QLDTY I LXNFVAFSRTDKYRQLPLE KVYS L 

LS SNRLEVSCETEV YEGALL YHYSLEQ VQADQ I S I»HE P P KLLET 

VRFPLMEAE\njGjlIiHDKliDPSPLRDTVASALMYHRNESL^ 

SPQTELRSDFQCWGFGGIHSTPS\MSSATRPKYLNPLLGEWKH 

FTASLAPRMSNQGIAVLNNFVYLIGGDNNVQGFRAESRCWRYDP 

RHNRWFQIQSLQQEHADLSVCWGRYIYAVAGRDYHNDLNAVER 

YDPATNSWAYVAPLKRBVYAHAGATLEGKMYITOGRKGRIT 


6703 


45 


1244 


G VGPRAAAM P LELELC PGRW VGGQHPCF I IAEIGQNHQGDLDVA

KRMIRMAKECGADCAKFQKS ELE FKFNRKALER PYTS KHS WGKT 

YGEHKRHLEFSHDQYRELQRYAEEVGIFFTASGMDEMAVEFLHE 

LmrPFFKVGSGOTNNFPYLEKTAK/TRGWHSVLRDVCGVQLNDE 

TSSWDVI/SRVRTSKEKVIWVLVIjDYSGRPMVISSGMQSMDTMKQ 

VYQIVKPLNPNFCFLQCTSAYPLQPEDVNLRVISEYQKLFPDIP 

IGYSGHETGIAISVAAVALGAKVIjERHITLDKTWKGSDHSASLE 

PGELAELVRSVRLVERALGSPTKQLLPCEMACNEKLGKSWAXV 

KIPEGTILTMDMLTVKVGEPKGYPPEDIFNLVGKKVLVTVEEDD 

TIMEE 


6704 


82 


1007 


TMNTRNRVVNSGLGASPASRPTRDPQDPSGRQGEIiSPVEDQREG 
LEAAPKGPSRESWHAGQRRTSAYTLIAPNINRRNEIQRIAEQE 
LANLEKWKEQNRAKPVHLVPRRLGGSQSETEVRQKQQLQLMQSK 
YKQKLKREESVRIKKEAEEAELQKMKAIQREKSNKLEEKKRLQE 
NLRREAFREHQQY KTAE FL/RQTEHRIARQKCLS KCCLW PTILN 
MGQKLGLQ\DSLKAEENRKLQKMKDEQHQKSELLELKRQQQEQE 
RAKIHQTEHRRVNNAFLDRLQGKSQPGGLEQSGGCWNMNSGNSW 
GI 


6705 


2 


786 


RLCRNSARVPdGWSASRStX5EGAGFIGPLRGPHPRAGGTGTSFT 
S YKRKGG I MSTIAAFYGGKS I L I TVATGFLGKELMEKLFRTS PD 
LKVIYILVRPKAGQTLQHRVFQILDSKIiFEKVIEVRPNVHEKlR 
AIYADLNQNDFAISKEDMQEI/LSCTNIIFHCAATVRFDDTLRHA 
VQLNVTATRQLLLMASQMPKLEAFIHISTAYSNCNLKHIDEVIY 
PCP VE PKK 1 1 DSLEW\LDDA I IDE I TPKLI RD W PNI YTYTK 


6706 


130 


531 


FT^SS^SkSQEMIJ3KIiNMLRNIXjHFCDITIRVQDKIFRAHKVVL 
AACS DFFRTKLVGO^u^DENKNVLDIjHHVTVTGF IPLLE YAYTAT 
LSINTENI IDVLAAAS YMQMFS VAS TCSEFMKS S I LWNTPNSQP 
EK


6707 


2233 


1343 


YWSG IGYELQHFHWRKFKFEKKGPPSTCQBRLYESRSRWPCIS * " 
GMWVGWTAVNGSW*GGQLRCVCVCTSHSSDSTRSSQRASKCHS 
FF I LSQ* KT * S S WENWVFAKYSRI YS YGHS CS KGRGD * DFK*NV 
SQAR*SRFCGLCNPCGHCGLDINLRGGSSPWTDKHSCVHNNLLC 
NRRVFSLIiCEGPGHCYQGAVCREACAAASPGLDSAAEPHRLCEH 
TD * LPK*GPG YI QHFHCDSNI LCILYNIS FNL FS YSF * GVARYA 
C * RCH WY FEWL1* YNHCGD I LVACL* RRQL* SSQ 


6708 


115


1729 


GEI IRVVHPHRPCKLALGSDGVRVTMESALTARDRVGVQDFVLL 
ENFTSEAAFIENLRRRFRENLIYTYIGPVLVSVNPYRDIiQIYSR 
QHMERYRGVSFYEEPPHLIAVADTVYRALRTERRDQAVMISVES 
GAGKTDATKRLLQLYAETCPAPQRGGAVRDRLLQSNPVI*EAFGN 
AKTLRNDNS S RFGKYMDVQFDFKGAPVGGKI LS YLLEKS R WHQ 

NHGERNFHI FYQLLEGGEEETLRRLGLERNPQS YLYLVKGQCAK 
VS S IKDKSDWKWRKALTVIDFTEDE VEDLT, <3 T a a cvt .mt pmt it 

FAANEESNAQVTTEKQLKYLTRLLSVEGSTLREALTHRKI IAKG 
E E LLS PLNLEQAAYARDALAKAVYS RTFTWL VG K I NRS LAS KD V 
ES PS WRSTTVLG LLD I YGFEVFQHNS FEQ FCINYCNE KLQQLF I 
ELTIiKSEQEEYEAEG I AWEPVQYFNNKI ICDLVEE KFKGI I \S I 
LDE\ECLRPGE 


6709 


3 


894 


PPHEHLFPSGERGPFSFLVSRRGLGPGKMGKKGKKEKKGRGAEK 
TAAKMEKKVS KRS RKEE E DLEAL TAHPflTT.nA KT3 iyyfvpt. dpdd 

PSPRLNASLSVHPEKDELILFGGEYFNGQKTFIiYNELYVYNIRK 
DT WTKVDI PS PP PRRCAHQAWVPQGGGQI; WVFGGEFAS PNGEQ 
F YHYKDLWVLHLATKTWEQVKSTGGPSGRSGHRMVAWKRQIi I LF 
GGFHESTRD YI Y YNDVYAFNLDTFTWS KLSPSGTGPTPRSGCQ\ 
IPSLPRAASSVYGGYSKQRVKKDVDKGTRHSDMF 


6710


158


980


RHKMr^YRVESSSGRAARKMRLALMGPAFIAA _ iGYIDPGNFATN 
IQAGAS FG YQLLWVWWANLMAMLI Q I LSAKLG I ATGKNIAEQI 
RDHYPRPWWFYWVQABI IAMATDLAEFIGAAIGFKLILGVSLL 
CGAVLTGIATFLILMLQRRGQKPLEKVIGGLLLFVAAAYIVELI 

fsqpniaqlgkgmvxpslptseavflaagvl\gatimphvi/yi 
whss1»tqhlhggs rqqrysatwwdvai amtiagfvn la i mataa 
selnfyghtgva


6711


3 


347 


VTECKTMTCKMSQLERNI *TMINTLHKYSVKLGHPDTtlHGEFK 
ELVRTDLHNILMKBNKNDQAI*HIMEDLDTNAHMQIIFKELIML 
MAMLT MS YHDNMHDADYGPGQQHRPG 


6712 


118 


578 


PHGQKRTRYPQVRAPGQQPQAQLAMALCLKQVFAJCDKTFRPRKR 
FEPGTQRFELYKKAQASLKSGLDLRSVVRLPPGENIDDWIAVHV 
VD FFNRINL I YGTMAERCS * TSCP VMAGGPR YE YRWQDERQYRR 
PAKLSAPRYMALLMDWIESLI 


6713 


2485 


3 


QARGS D SEDGE FE I Q AE DD ARARKLG PGR PLPTFPTS ECTS DVE 
PDTREMVRAQNKKKKKSGGFQSMGLS YP VFKG I MKKSYKVPTP I 
QRKTI PVILDGKDVVAMARTGSGKTACFLLPMFERLtKTHSAQTG 
ARALILSPTRELALQTLKFTKELGKFTGLKTALILGGDRMEDQF 
AALHENPDI I IATPGRLVHVAVEMSLKLQS VEYWFDEADRLFE 
MGFAEQLQEI I ART • PGGHQTVLFSATLP KLLVEFARAGLTEPVL 
IRLDVDTKLKEQLKTS FFLVREDTKAAVLLHLLHNVVRPQDQTV 
VFVATKHHAE YLTELLTTQR VSCAH I YSALDPTARKINLAKFTL 
GKCSTLI VTDLAARGLDI PLLDNVINYS FPAKGKLFLHRVGRVA 
RAGRSGTAYSLVAPDEIPYLiDLHLFLQRSLTLARPIjKEPSGVA 
GVDGMLGRVPQSWDEEDSGLQSTLEASLELRGLARVADNAQQQ 
YVRSRPAPSPES I KRAKEMDLVGLGLHPLFSSRFEEEELQRIiRb 
VDS I KNYRSRATI FE 1 NASSRDLCSQ VMRAKRQ KDRKAI AR FQQ 
GQQGRQEQQEGPVGPAPSRPALQEKQPEKEEEEEAGESVBDIFS 
BWGRXRQRSGPNRGAKRRREEARQRDQEFYIPYRPKDFDSBRG
LS ISGEGGAFEQQAAGAVLDLMGDEAQNLTRGRQQIiKWDRKKKR 
FVGQSGQEDKKKIKTBSGRYISSSYKRDLYQKWKQKQKID*S*L 
GRRRG ILTRRRPRTEEVGEARPLAQAGCI PGPHAPRHPLQAESA 
LELKTKQQILKQRRRAQKAALSLQRWWPQAALCPQ 


6714 


169 


1416 


NNCQELLPPPPAPMAHI PSGGAPAAGAAPMGPQYCVCKVELS VS 
GQNLLDRD VTS K5DPFCVLFTENNGRWIEYDRTETAINNLNPAF 
SKKFVLDYHFEEVQKLKFALFDQDKSSMRLDEHDFLGQFSCSLG 
TI VSSKKI TRPLLLLND KPAGKGL I T IAAQELSDNRVI TLSLAG 
RRLDKKDLFGKS DP FLE FYKPGDDGKWMLVHRTEVI KYTLDPVW 
KPFTVPLVSLCDGDMBKPIQVMCYDYDNDGGHDFIGEFQTSVSQ 
MCEARDSVPLEFECINPKIGQRKKKNYKNSGI 1 1 LRSCKINRD YS 
FLD YI LGGCQLMFTVG IDFTASNGNPLDPSSLHYINPMGTNE YL 
SAIWAVGQ1IQDYDSDKMFPALGFGAQLPPDWKVSHEFAINFNP 
TNPFCSGVDGIAQAYSACLP 


6715 


32 


493 


G PAGAESGSLHCLPATVQALAGAAHS PHGGQPPRRGPL IGSGMP 
GKPKHLGVPNGRM VLAVSDGELSSTTGPQGQGEGRGSSLS IHSL 
PSGPSS PFPTEEQPVAS WALS FERLLQDPLGLAYFTEFLKKEFS 
AENVTFWKACERFQQI PASDT 


6716 


1 


176 


GAGGPAPRSFGSEEPRAALERDKMSARAAAAKSTAMEETAIWEQ 
HTVTLHRVSLCCSK 


6717 


115 


896 


LFAMSGFENLNTDFYQTSYSIDDQSQQSYDYGGSGGPYSKQYAG 
YDYSQQGRFVPPDMMQPQQPYTGQIYQPTQAYTPASPQPFYGNN 
FEDEPPLLE BLG INFDH I WQKTLTVLHPLKVADGS I MNETDLAG 
PMV F CLAFGATLLLAGK I Q FG YVYG I S AIGCLGM F CLLNLMS M T 
G VS FG CVAS VLG YCLLPM I LLSS FAVI FSLQGMVG 1 1 LTAG I IG 
WCS FS ASKI FISALAMBGQQLLVAYPCALLYGVFALISVF 


6718


290 


599 


KQSS TVPGTI LPS LKWHNSGLCKFPETGGKMTTFKEGLTFKDVA 
VI FTEEELGLLDPVQRNLYQDVMLBNFRNUjSVGHHPFKHDVFIj 
LEKEKKL0IMKTATQ 


6719 


1 


691 


PrRPEEQDREDGKCHKMtiMNPrSGNtNCDPIAMSQCSS6HGCET 
DLDS DDDKI E KPNNFMKDS AS QDNGLS RXI SR KRVCS3DSVS3 L 
QWKKSSKARTGLLR I TRRCAATAANK I KLMSDVEDVSLENVHT 
RSKNGRKKPLHLACTTAKKKLSDCEGSVHCEVPSEQYACEGKPP 
DPDSEGSTKVLSQALNGDSDSEDMLNSEHKHRHTNIHKIDAPSK 
RKSSSVTSSG 


6720 


3 


822 


hbvaeeaggtvypqrgtmpgtkrfqhvietpe'pgkWELtgyeaa 
vpiteksnpltqdi^kadaenivrllgocdaeifoeegoalsty 

QRLYS ES I LTTMVQVAGKVQEVLKE PD GGLWLSGG GTSGRMAF 
LMSVS FNQLMKGLGQKPLYTYLIAGGDRSWASREGTEDSALHG 
IEEL KKVAAG KKRVI VIG I S VGLS AP FVAGQMDCCMNNTAVFLP 
VAjV^iw^vSMARHPFPPPRILRSLTVFPSLRAPHYQITSLLFSM 
SWTLISE 


6721 


3 


822 


HEVAEEAGGTVYPQRGTMPGTKRFQHVI ETPEPCjKWELT'GYEAA ' 
VP ITEXSNPLTQDLDKADAENI VRLLGQCDAEI FQEEGQALSTY 
QRLYS ES I LTTMVQVAGKVQEVLKEPDGGLWLSGGGTSGRMAF 
LMSVS FNQLMKGLGQKPLYT YLI AGGDRS WASREGTEDSAXHG 
IEELKKVAAGKKRVIVIGISVGLSAPFVAGQMDCCMNNTAVFLP 
VL VGFNP VSMARHPFPPPR ILRSLTVFPS LRAPHYQI TSIXFSM 
SWTLISE 


6722 


1 


390 


RSWSKRTWQALPMAVLFLLtFLCGTPQAADNMQAIYVAL^SAVE 
LP CP S PS TLHGDEHI»S WFCS PAAGSFTTLVAQ VQVGRPAPDPGK 
PGRESRLRLLGNYSLWLEGSKEEDAGRYWCAVLGQHHNYQNW 


6723


173 


659 


VCQYCTARMADPGISAGQFVAVVWDKSSPVEALKGLVDKLQALT 
GNEGRVS VENT KQLLQSAHKESSFDI I LSGLVPGSTTLHSAEIL 
AEIARILRPGGCLFLKEPVETAVDNNSKVKTASKLCSALTLSGL


6724 


173 


659 


VE VKELQRe pijTpbevqs vrehlghesdnl 1 

v v-w x v- iAW"uu;rbi5AbQF VAVVWDKSS PVEALKGLVDKLQALT 
GNEGRVS VENI KQLLQS AH KES S FD I ILSGLVPGSTTLHS AEI L 
AE I ARILRPGGCLFLKEPVBTAVDNNSKVKTASKLCS ALTIiS GL 
VEVKELQRE PLTPEE VQS VREHLGHESDNL 


6725 


356 


722


RRRTPPV1 IiATMDDDLMLALIUiQEEWNIjQEAERDHAQESI^LVD 
ASWELVDPTPDLQALFVQFNDQPFWGQLEAVEVKWSVRMTLCAG 
I CS YEGKGGMCS IRLSEPLLKLRPRKDLVEVFFV 


6726 


98 


IX'* 


HLgKMERKINRREKEKEYEGKHNSLEDTDQGKNCKSTLMTLNVG 
GYLYITQKQTLTKYPDTFLEGIVNGKILCPFDADGHYFIDRDGL 
LFRHVLNFLRNGELLLPEGFRENQLIiAQEAEFFQLKGLAEEVKS 
RWEKEQLTPRETTFLE ITDNHDRSQGLRIPCNAPDFIS KIKSRI 


6727 


1 


831 


FRGMGDERPHY XGKHCTPQKYDPTFXGPIYMhG^TDlMCCVFLL 
LAIVGYVAVGI IAWTHGDPRKVI YPTDSRGEF03QKGTKNENKP 
YL FYFNIVKCAS PLVLLEFQCPTPQI CVEKCPDRYLTYLNARSS 
RDFE YYKQFCTOGFKNNKG VAEVLRDGDCPAVL I PS KPLARRCF 
PA IHAYKGVLMVGNETT YEI5GHGSRKNI TDLVEGAKKATOVIjEA 
RQLAMRIFEDYTVSWYMDIISLGIAMAMSLLFIILLRFLAGIMG 
RGMI IMGILVLGY 


6728 
6729


486 


935 


r^iHijKaijAUSHljSWKMFLVGI,TGGIASGKSSVIQVF(iQIjGCA 
VI DVD VMARHWQ PG YPAHRRI VEVFGTEVLLENGD INRKVLGD 

LIFNQPDRRQLLNAITHPEIRKEMMKETFKYFLREPRTSPRGKK 
HVPSALKEADSLMRRDT 


6730


259 


1191 


VGLTGAQSGRTAS M3RDQRAVAG PALRR WLLLG TVTVG FLAQS V 

IAGVKKFDVPCGGRDCSGGCQCYPEKGGRGQPGPVGPQGYNGPP 

GLO^FPGLQGRKGDKGERGAPGVTGPKGDVGARGVSGFPGADGI 

PGHPGQGGPRGRPQYDGCNGTQGDSGPQGPPGSEGFTCPPGPQG 

PKGQKGEPYALPKEERDRYRGEPGBPGLVGFQGPPGRPGHVGQM 

GPVGAPGRPGPPGPPGPKGQQGNRGLGFYGVKGEKGDVGQPGPN 

GIPSDTLHPIIAPTGVTFHPDQYKGEKGSEGEPGIRGISLKGEE 
GIM 




784 


1015


NMVDYYEVLGLQRYASPEDIKKAYHKVALKWHPDKNPENKEKAE 
RKFKE VAEA YE VLSNDEKRD I YDKYGTEGLNE F 


6731 


1 


446 


G IRKRIjHGAVVPRVEV^CP WETRfiiEGVHLERPTS PL KNNDEGS 
uux * *wijiJ5»AV5DSASKS CVPSRNCLDLYEEILTEEGTAKEATY 

ndlqveygkcqlqmkei^kkfkeiqtqnfslinenqslkknisa 

LIKTARVEINRKDEEI 


6732 


102 


1205 


GRWQRRPPPPSPPLWCLQPGGGSDPQQLTQLRHCLSHSPQDTPW 
AQRQVCYTAATTQAAAPATRNCLPDHSGHRPTPPRSHRHHRQEN 
LGS I KP SSRSTKATSTTMAGDGRRABAVRBGWGVYVTPRAP IRE 
uriuarcjjH t*y NUbb b DAPAYRTPPSRQGRRE VR PS DE P PEVYGDFE 
PLVAKERS PVGKRTRLEE FRS DS AKEE VRESAYYLRS RQRRQ PR 
PQETEEMKTRRTTRLQQQHSEQPPLQPSPVMTRRGLRDSHSSEE 
DEASSO^DLSQTISKKTVRSIQEAPAVSEDLVIRLRRPPLRYPR 

YEATSVC^KVNFSEEGETEEDDQDSSHSSVTTVKARSRDSDESG 
DKTTRSSSQYIESFW 


6733 


613 


1311 


RSCRgVGMRSRNG^GESASDGHISCPkPSltGNAGEKSXs^Al? - 

KKXKSNRKEDDVMASGTVKRHLKTSGECERKTKKSLBLSKEDLI 

QLLSIMEGELQAREDVIHMLKTEKTKPBVLEAHYGSAEPEKVLR 

VLHRDAILAQEKSIGEDVYEKPISELDRLEEKQKETYRRMLEQL 

LLAEKCHRRrVYELENEKHKHTDYKNKSDDFTin^EQERERLKK 

LLEQEKAYQARKE 


6734 


189 


551 


SAAMFPVFSGCFQELQEKNKSLELVSFEEVAVHFTWEESiQDLDD 
AQRTLYRDVMLETYSSLVSLGHCITKPEMIFKLEQGAEPWIVEE
TLNtiRLSGGSKKQVFSGICHRSLVBLQEVHIiV 


6735 


280 


558 


KSRRAGVTKMSNPPLKQVFNKDKTFRPKRKFEP6TQRFELHKKA 
QAS LNAGLD LRIAVQL P PGE DLNDWVAVHWD FFNR VNLI YGT I 
XDGCT 


6736 


195 


808 


MNYELNFKREMPNI KSLGLTNLNFLLKRIiSS VLPLI TDYVYFEN 
S SSNP YL IRR I EELNKTASGNVEAKWCFYRRRD I SNTL I MLAD 
KHAKEI EBES ETTVEADLTDKQKHQLKHRELFLSRQYESLPATH 
IRGKCSVALLNETESVIiSYLDKEDTFFYSLVYDPSLKTLLADKG 
EIRVGPRYQADIPEMLLEGTFFCVFAVL 


6737


ISO 


1209 


PVIMPLHFSPGDIVRPSCCVSSSPKLRRNAHSRI^ESYRPDTDLS" 
REDTGCNLQHISDRENIDDLNMEFNPSDHPRASTIFLSKSQTDV 
RE KRKSLF I NHHPPGQ I ARKYS S CSTI FLDDSTVSQPNLKYT I K 
CVALAI Y YH I KNRJD PDGRMLLDI FDENLH PLS KS E VP PDYDKHN 
PEQKQIYRFVRTLFSAAQLTAECAIVTLVYLERLLTYAEIDICP 
ANWKR IVLGAI LLASKVTODQAVWWVDYCQI LKDITVEDMNELE 
RQFLEIaLQFNINVPSSVYAKYYFDLRSLAEANNLSFPLEPLSRE 
RAHKLE A I S R LCE DKY KDLRRSARKRSAS ADNLTL PRWS PAI I S 


6738 


148 


653 


CACAEQPARAEVGAATALPVRWASGEMAPSGSIAVPIAVo\7LLL 
WGAPWTHGRRSNVRVI TDENWREIiliEGDWMI EFYAP WCPACQNIi 
QPEWESFABWGEDLEVNIAKVDVTEQPGLSGRFlITAIiPTIYHC 
KDGEFRRYQGPRTKKDFINFISDKEWKSIEPVSSWF 


6739 


3 


631 


SWPDMAEEEVAKLEKHLMLLRQEWKLQkXIJ^TEkftClAL 
ANKESSSES F ISRLLAIVADLYEQEQYSDLKI KVGDRHI SAHKF 
VIAARSDSWSLANLSSTKELDLSDANPEVTMTMLiRWIYTDELEF 
REDDVFLTELMKLANRFQLQLLRERCEKGVMSLVNVRNCIRFYQ 
TAB ELNASTLMNYCAE 1 1 ASHWVS EVEGVNKAL 


6740 


3 


631 


SWPDMAEEEVAKXSKHLMLLRQEYVKJLrQKKIiAET^ 
ANKESS5ESFISRLLAIVADLYEQEQYSDLKI KVGDRHI SAHKF 
VLAARSDSWSLANLSSTKELDLSDANPEVTMTMLRWIYTDELEF 
REDDVFLTELMKIJUTOFQLQLIJIERCEKGVMSLVNVRNCIRFYQ 
TAEELNASTLMNYCAEI IASHWVSEVEGVNKAL 


6741 


141 


960


PLTLPFSSRARAGHTMNTSPGTVGSDPVILATAGYDHTVRFWQA 
HSGICTRTVQHQDSQVNALEVTPDRSMIAAAVQPVSLGYQHIRM 
YDLNSNNPNPI ISYDGVNKNIASVGFHEDGRWMYTGGEDCTARI 
WDLRSRNLQCQRIFQVNAPINCVCLHPNQAEIjIVGDQSGAIHIW 
DLKTDHNEQLIPEPEVSITSAHIDPDASYMAAVNSTLVPFSCLL 
PLAIGILQE<3EFESIJU^RGI*LFIACQGNCYVWNLTGGIGDEVTQ 

T TDVTPTTJ 


6742 


141 


960 


PLTLPFSSRARAGHTMNTSPGTVGSDPVIIATAGYDHTVRFWQA 

hsgictrtvqhqdsqvnalevtpdrsmiaaavqpvslgyqhirm 

YDLNSNNPNPI IS YDG VNKNIAS VGFHEDGRWMYTGGEDCTAR I 
WDLRSRNLQCQRIFQVNAPINCVCLHPNQAELIVGDQSGAIHIW 
D LKTDHNEQL I PE P E VS I TS AHI DPDAS YMAAVNSTLVPFS CLL 
PLA I G I LQEGEFESLARRGLLFLACQGNC YVWNLTGG IGDE VTQ 
LIPKTKIP 


6743 


1 


412 


MHSTQDKSLHLEGD PNPSAAPTSTCAPRKMP KRI SIS KQLAS VK 
ALRKCSDLEKAIATTALIFRNSSDSDGKLEKAIAKDLLQTQFRN 
FAEGQETKP KYRE I LSELDEHTENKLDFEDFMI LLLS I T VMSDL 
LQNIR


6744 


95 


1343 


RTPARNRCAGCE VLS R FS S PNKAS SFALQSAGGGL PA VRALRRD 
RQKVSTVG YGMDE VEQDQHBARLKELFDS FDTTGTGS LGQEELT 
DLCHMLSLEEVAPVLQQTL^DNLIjGRVHFDQFKEALILILSRT 
LSNEEHFQEPDCSLEAQPKYVRGGKRYGRRSLPEFQESVEEFPE 
VTVIEPLDEEARPSHIPAGDCSEHWKTQRSEEYEAEGQIiRFWNP 
DDLNASQSGSSPPQDWIEEKLQEVCEDLGITRDGHLNRKKLVSI
PSASTPYRQLKRHLSMQSFDESGRRTTTSSAMTSTIGFRVFSCL 
DDGMGHASVERILDTWQEEGIENSQEI LKALDFGLDGNINLTEL 
TLALENELLVTKNS IHQACI 


6745


1 


588 


TPRDQ^WAQRRRWLIiGCASWESWEAAIAAGPGbPSSTARQQNNP 
AAGTEC FAAVWARGTAMGS VLSTDSGKSAPASATARALERRRDP 
ELPVTS FDCAVCLEVLHQPVRTRCGHVFCRSC1ATSLKNNKWTC 
PYCRAYLPSEGVPATDVAKRMKSEYKNCAECDTLVCLSEMRAHI 
RTCQKYIDKYGPLQELEETA 


6746 


110 


492 


GATGAMAESAPARHRRKRRSTPLTSSTLPSQATEKSSYFQTTEI
S LWTWAAI QAVEKKMESQAARLQSLEGRTGTAEKKIiADCEKMA 
VEFGNQL£GKWAVLGTLI£BYGLI£RRLENVENLLRNRN 


6747 


247 


484 


EAVTFKDVAWFTEEELGLLDIiAQRKLYRDVMLENFRNLLSVGH " 
QPFHRDTFHFLREEKFWMMDIATQREGNSVYAGVC 


6748 


201 


665 


MTTFKEAVTFKDVAWFTEEELGLLDPAQRktYRDVMLENFRNL 
LSVGNQPFHQDTFHFLGKEKFWKMKTTSQREGNSGGKIQIEMET 
VPEAGPHEEWSCQQIWEQIASDLTRSQNSIRNSSQFPKEGDVPC 
QI EARLS ISXVQQXPYRCNECXQ 


6749 


95 


719 


RREVKGGDGVCPRARGSPQSQQFPSCMGGEGIjQQSGEAUXSAM " 
SAGGPCPAAAGGGPGGASCSVGAPGGVSMFRWLEVLEKEFDKAF 
VDVDLLLGE I DPDQAPI TYEGRQKMTSLSSCFAQLCHKAQS VSQ 
INHKLEAQLVDLKSKLTEI'QAEKWLEKEVHDQLLQLHSIQXjQL 
HAKTGQSADSGTIKAKLSGPSVEELERELKAN 


6750 


3 


42ft 


SCESRRPGAKl^VWASGALPRDTTGLGSEQ PSGDVAQSNRATMGT 
TAPGPIHLLBLCDQKLMEFLCNMDNKDLVWLEEIQEEAERMFTR 
EFSKEPELMPKTPSQKNRRKKRRISYVQDENRDPIRRRIiSRRKS 
RSSQLSSRR 


6751 


152 


1417 


PTKATEMAGASVKVAVRVRPFNSREMSRDSKCI IQMSGSTTTIV 
NPKQPKETPKSFSFDYSYWSHTSPEDINYASQKQVYRDIGEEML 
QHAFEGYNVCIFAYGQTGAGKS YTMMGKQEKDQQGI I PQLCEDL 
FSRINDTTNDNMSYSVEVSYMEIYCERVRDLLNPKNKGNLRVRE 
HPLLGPYVEDLSKLAVTSYNDIQDLrmSGNKARTVAATNMNETS 
SRS HAVFNI I FTQKRHDAE TNITTE KVSKI S LVDLAGS ERADST 
GAKGTRLKEGANINKSLTTLGKVI SALAEMDSGPNKNKKXKKTD 
FI PYRDSVLTWLLRENLGGNS RTAMVAALSPADINYDETLSTLR 
YADRAKQIRCNAVINEDPNNKLIRELKDEVTRLRDLLYAQGLGD 
ITDMTNALVGMSPSSSLSA1»SSRNV 


6752 


24 


1834 


RNCVPPLGC YRSRVKFHSDI KMQYSHHCEHLLERIjNKQREAGFL " 
CDCTIVIGEFQFKAHRNVLASFSEYFGAIYRSTSENNVFLDQSQ 
VKADGFQKLIiEFI YTGTLNLDS WNVKEIHQAADYLKVEEVVTKC 
KIKMEDFAFIANPSSTEISSITGNIEIiNQQTCLLTLRDYNNREK 
SEVSTDLIQANPKQGALAKKSSQTKKKKKAFNSPKTGQNKTVQY 
PSDILENASVELFLDANKLPTPWEQVAQINDNSELELTSWEN 
l trvjvju l VH l V i^iU*KKGKSQPNCAJjKEHSMSNIAS VKS PYEAE 
NSGEELDQR YSKAKPMGNTCG KVFSEAS S LRRHMR IHKGVKP YV 
CHLCG kaftqcnqlkthvrthtgekp YKCELCDKGFAQKCQLVF 
HS RMHHGEE KP YKCD VCNLQ FATSS NL K IHAR KH S GE KP YVCDR 
OGQRFAQASTLTYKVRRHTGEKPYVCDTCGKAFAVSSSLITHSR 
KHTGEKPFICELCGNSYTDIKNLKKHKTKVHSGADKTLDSSAED 
HTLSEQDS IQKSPLSETMDVKPSDMTLPLALPLGTEDHHMLLPV 
TDTQSPTSDTLLRSTVNGYSEPQLIFLQQLY 


6753 


2 


1305 


VPSLP YP PQKWAHTE FTTSSDSETANGI AKPDP VMPGGEEKAS 
PFGIKLRRTNYSI4RFNCDQQAEQKKKKRHSSTGDSADAGPPAAG 
S ARGEKEMEGVALKHGPS LPQERKQAPS TRRDSAE PS S SRSVPV 
AHPGPPPASSQTPAPEHDKAANKMPtAQKPALAPKPTSQTPPAS
PliSKLSRPYLVELLSRRAGRPDPEPSEPSKBDQESSDRRPPSPP 
GPEERKGQKRDEEEEATBRKPaSPPLPATQQEKPSQTPEAGRKE 

kpmlqsrhsldgskltekvetaqplwitlalqkqkgfreqqatr 
eer kq areakqaekls kenvs vs vq pgss s vsragslhks talp 
eekrpetavsrlerreqlkkantlptsvtveisysspaaplvkb 
vskrfsspddapvssepawlalakrkakawsdcpliik 


6754 


2 


413


FVRRRRRRLGGPEVNTMSSLHKSRIADFQDVLKEPSIAIiEKIjRE 

lsfsgipcegglrclcwkillnylpleraswtsilakqrelyaq 
flremiiqpgiakanmgvsredvtfedhplnpnpdsrwntyfkd 

NEVLL 


6755


298 


1343 


pglqlqvaleadwfldmp(^rrgpsrqqlsK5aLpslqtlvggg ■ 
cgngtglrnrngsaiglpvppitalitpgpvrhcqipdlpvdgs 
ijlfeflffiyllvalfiqyiniyktvwwypynhpasctslnfhl 
idyhiaafitvmlarrlvwaliseatkagaasmihymvlisarl 

VLLTLCGWVLCWTLVNLFRSHSVLNLLFLGYPFGVYVPLCCFHQ 

dsrahllltdynywqheaveesastvgglakskdflslllesl 
keqfnnatpipthscplspdlirneveclkadfnhrikevlfns 
lfsayyvaflplcf vkvsg y ltpmcfldlcvny inwvflv 


6756
6757


180 


754 


IJSKALGSLPI^i pvswGsLrtlkyqqqpi^pkvllcqtrvqchd^ 

lrslqpqppglkqsfclrvlglqtgattpglrdltckbliilte 

reaqkrkkrkekesgmaltqgpltfrdvaiefsqeewksldpvq 

kalywdvmlenyrnlvflgkdnfalevkicprvflyflcclswe 

pfhylteteallthk 




2 




nsrveapeahsresqgsdamrkhlswwwlatvcmllfshlsavq 
trgikhrikwnrkalpstaqiteaqvaenrpgafikqgrkldid 

FGAEGNR YYEANYWQFPDG I HYNG CSEANVTKEAFVTGC I NATQ 
AANQGE FQKPDNKLHQQVLW 


6758 


1 


1008 


rtovjf nuruKKfKJJKAf WJjJ?A1UiLRGVIiAVWVSI»SAIiGPGS FCRR 
RVPSLAQLGHSEAAPSPDDVRWSRVPDRCPEERDRAWPPPPPPS 
LPPSFRRNMANNSPALTGNSQPQHQAAAAAAQQQQQCGGGGATK 
v ooiujun v JjPJjWGNE KTM WLNPMILTN I LSS P YFKVQLYELK 
TYHEVVDEIYFKVTHVEPWEKGSRKTAGQTGMCGGVRGVGTGGI 

vstafcllyklftlkltrkqvmglithtdspyiralgfmyiryt 
qpptdlwdwfesflddeedldvkagggcvmtigemlrsfltkle 
wfstlfpri pvpvqknidqqi ktrprki 


6759 


1 


513 


RKHNFHSLDGTSTRAFHPQTGliPLLSSPVPQRKTQSGCFDLDSS 
LLHLKS FSSRS PRP CLNIEDD PDIHE KPFLSS SAP P I TSLSLLG 
NFEESVLNYRFDPLG X VDG FTAEVGASGAFCPTHLTIi PVEVS FY 
S VS DDNAPS PYMGV I TLESLGKRG YRVPPSGTIQWCVL 


6760 


239 


606 


VfcS KKKGLSAEEKRTRMME I FSETKDVFQLKDLEKIAPKEKGTT

AMSVKEVLQSLVDIXSMVIX^ERIGTShfYYWAFPSKALHARKHKLE 

VLESQLSEGSQKHASLQKSIEKAKIGRCETEERT 


6761


29 


1733 


ERTLRGLREVAAPSDVADAAVSRRGRCCCdiHCTQTQVAQDCPS
SSSSVQRCELSLFQSLHTMTSKKLVNSVAGCMDALAGLVACNP 
NLQLLQGHRVALRSDLDSLKGRVALLSGGGSGHEPAHAGFIGKG 
MLTGVIAG AVFTSPAVG S I LAA I RAV AQAGTVGTLL I VKNYTG D 
RLN FG LAR EQARAEG I PVEMWIGDDSAFTVLKKAGRRGLCGTV 
LIHKVAGALAEAGVGLEEIAKQVNVVTKAMGTLGVSLSSCSVPG 
SKPTFELSADEVELGLGIHGEAGVRRIKMATADBIVKLMLDHMT 
NTTNASHVP VQPGSSWMMVNNLGGLS FLELGI IADATVRSLEG 
RGVKIARALVGTFMSALEMPGISLTLLLVDEPLLKLIDAETTAA 
AW PNVAAVS I TGRKRSRVAPAEPQE APDSTAAGGS AS KRMALVL 
ERVCSTIJjGLEEHLNaLDRAAGIX5DOGTTHSRAARAIQEWLKEG 
PPPASPAQLLSKLSVLLLEKMGGSSGALYGLFLTAAAQPLKAKT 

slpawsaamdagleamqkygkaapgdrtmldslwaagqbl 


6762 


3 


613 


ASTISWRLC^AGABAIUiPVPVAGERAGGGAMWFMYLLSWLSLFI 
QVAPI TLAVAAGLYYIiAELIEE YTVATSRI I KYMI W FS TAVL IG 
LYVFERFPTSMIGVGLFTNLVYFGLLQTFPFIMLTSPNFILSCG 
LVWNHYIiAFQFFABE YYPFSEVLAYFTFCLWI I PFAFFVSLSA 
GENVLPSTMQPGDDWSNYFTKGKRGK 


6763 


2


760 


SGPD F PGRRFRGCCCVRPPAGAGME LGGHWDMN S APRX*VSETAE 
RKOEGKTGTEAEAADSGAVGARRFIiljPli YLGOPLOT ,Prt VQMUXTD 
L LSLHVKS LGAS PTVAG I VGS SYG ILQLFSSTLVGCWS DWGRR 
SSIiLACILLSALGYLLLGAATNVFIiFVLARVPAGIFKHTLS ISH 
ALLSDWPEKERPLV1GHFNTASGVGFILGPWGGYI>TELEDGF 
YLTAF I CFIiVF I LNAGLVW FFPRREAKPGSTE 


6764 


80 


438 


LKKMDXMMLSVRNLFEQLVRRVEILSEGNBVQFIQLAKDFEDFR

KKWQRTDHELGKYKDLLMKAETERSALDVKLKHARNQVDVEIKR 

RQRAEADCEKLERQIQLIREMLMCDTSG31Q 


6765 


3 


550 


ARYSRVDHFCRRRCRAVARAPRFLLQFPSGPSRHFLAACVARWL
RGSVhVSZKLSGSMDGlVTEVAVGVKRGSDEhLSGSVLSSPNS 

NMQ^MV\7T2VW3WnQVirT?lfr3I?nWMTViaoCD\JT EITTJVT nr*mrmTnms • 

wnovji'i v v ft.R.r n.vlSUXUnUwil^KVJjnXKJUji^SVT j 
VXRLGLPFGKVTNILMLKGKNQAFbELATEEAAITNGNYYSAVT 
PHLRNQ


6766


1 


1287 


EGGSFKASLTWLWPLGEMKLHCEVEVISRHLPALGLRNRGKGVR
AVLSLCQQT3RSQPPVRAFLLISTLKDKRGTRYELRENIEQFFT

VDTPVSTLTPVKTSEFENFKTKMVITSKKDYPLSKNFPYSLEHL
QTS YCGLVRVDMRMLCLKS LRKLDLSHNH I KKLPATIGDL IHLQ 

LOELKNLKLDDNELIOPPCKTfiOT. IMT^RITT A&PKT VT.Dt?T. DC p t? 
RNI>SLEYLDLFGNTFEQPKVLPVIKLQAPLTLLESSARTILHNR 
IPYGSHIIPFHLCX5DLDTAKICVCGRFCXNSFIQGTTTMNLHSV 
AHTWLVDNLGGTEAP 1 1 SYFCSLGCYVNSSDI 


6767 


336 


919 


APMI CLCSSDLQFR YKEAFLRDRGLQ IGYCS VDDDP RMKHFLNV 
GRLOSDNEYKKDFAKSRSQFHSSTDQPGLLQAKRSQQIASDVHY 
RQPLPQPTCDPEQIjGLRHAQKAHQLQSDVKYKSDLNLTRGVGWT 
PPGSYKVEMARRAAELANARGLGLQGAYRGAEAVEAGDHQSGEV 
NPDATEILHVKKKKALLL 


6768


2 


363 


PGSTISCiTLLSEGSLPLCMQVACXSBEKHRAPTMKTIiRARFKKTE
LRLSPTDLGSCPPGGPCPIPKPAARGRRQSQDWGKSDERUJQAV 
ENNDAPRVAALI ARKGLVPTKLDPEG KSAFHL 


6769 


284 


396 


MSTPDFS TAENNQELANE VS CIjKAMLTLMLQAMGQAD 


6770 


1 


3*7 


QRNYQVIWSSTMAKLHDYYKDEVVKKL^tEFNVNSVMQVPRVEK 
ITLNMGVGEAXADKKLLDNAAADLAAISGQKPLITKARKSVAGF 
KIRQG YP IGCXVTLRGERMWEFFERL ITIAVPRIRD FRGLSAKS 


6771 


3 


378 


APAGTLAMrGKSVKDVDRYQAVLANIiLLEEDNKFCADCQSKGPR 
WASWNIGVFICIRCAGIHRNLGVHISRVKSVNLDQWTQEQIQCM 
QEMGNGKANRLYEAYLPBTFRRPQ I DP YLFWSNLEG 


6772 


1 


1406 


AAAFLGGMT VNG F INT VI TS L \ ERR YDtHS tf$SQh jl AS S YD I AA 
CLCLTFVSYFGGSG\HKPRWLGWGR\VLMGTGSLVFALPHFTAG 
P * *GWKLDAGVRTC PANPR\ P VCAG \HTSGLSRYQLVFMLGQ FI» 
HGVGATPLYTLGVTYLDENVKSSCSPIYIAIFYTAAILGPAAGY 
LIGGALLNI YTEMGRRTELTTESPLWVGAWWVGFLGSGAAAFFT 
AVPILG YPRQL PGS QR YAVMRAAEMHQLKDSSRGEASNPDFGKT 
IRDLPLSIWLLLKNPTFILLCLAGATEATLITGMSTFSPKFLES 
QFSLSAS EAATLFGYLWPAGGGGTFLGGF FVNKLRLRGSAV I K 
FCLFCTWS LLG I LVFSLHCPS VPMAGVTAS YGGSLLPEGHLNL 
TAPCNAACSCQPEHYSPVOGSDGUMYFSI^IAGCPAATBINWG 
QKVYRDCSCI PQNLS SGFGHATAGKCTST 


6773 


1 


630 


PWEAPKEHKYKAEEHTVVLTVTGBPCHFPFQYHRQLYHKCTHKG 
RPGPQPWCATTPNFDQDQRWGyCLEPKKVKDHCSKHSPCQKGGT 
CVNMPSGPHCLC PQHI»TGNHCQKEKCFEP QI»LRFPH KNEI WYRT 
E QAAVARCQCKGPDAHCQRLASQACRTNP CLHGGRCLE VEGHRL 
CHCPVGYTG?FCDVGE*G3GASRRPAPRWDGLAR 


6774 


146 


389 


LTEIi3DQQY?LFFILSS/WVPTFLSMDVDGRVIKADSFSKIlj5S 
GLRIGFLTG PKPLI ERVILKIQVSTLHPSTFNQLMISQ 


6775 


104 


614 


TCPSQIiRVLTARGGRRAP^PO!fitoTLVf 1 Al,TT?K , »n»JD qIjpt r dmmc 
GRPETMENLPALYTI FQGEVAMVTDYGAF1 KI PGCRKQGLVHRT 
HMSSCRVDKPSEIVDVGDKVWVKLIGREMKNDRIKVSLSMKWN 
QGTGKDLDPNNV\SLSKKRGGGDPSR1TLGRRSPLRLS 


6776 


3 


1108 


HERHERHEGALSQDALLRISIPLDSNMRPEKCRRFVHPQWQLLH 
LNGT FPKTS DADMEPCVDGKVYDRIS FSS TI VTEWDLVCDSQS L

CAALAPTFLIYCSLRFLSGIAAMSLITNTIMLIAEWATHRPQAM 
GI TLGMCPSG I AFMTLAGLAFAI RDWHILQLWS VPYFVI FLTS 
SWLLESARWLIINNKPEEGLKELRKAAHRSGMKNARDTIiTLEIL 
KSTMKKELEAAOKKKPFIjGERIiHMPNIC!in>T<5Y.T,PPTVT?axTTrva 
YFGLNLHG/ LKHLGNNVFLLQTLFGAV/ TPPGQLVLHLGHWGSG 
RVSSRGRVNCLGLFVLQVW 


6777 


119 


63 


cffhgpawrdcevratfakkqgqsgiisciAfspaqplyacgsy 
grslglyawddgspiallgghqggithlcfhpdgnrffsgarkd 
aellcwdlrqsgyplwslgrevttnqri yfdldptcqflvsgst 
sgavsvwdttcpsr^kpepvlsflpqkdctngvslhpslpllg 
hclpvsvcflsptesggrrrgagpslgsprrhvhlecriiqlwwc 
gggarlqhp* * sprarkgr 


6778 


311 


80S " 


IQS I TDESRGS I RRKNPANTRLRIjNVP ^ BBTAGDSE /ERS PEEE 
VQADPRIRSASPKCPTSSPFPKGRSPEX3EGET\DPEKVHFHPGP 
KDKSVAEKN\KGP\SPVSSEGIKDFFSMKPEWENLNQSKTVRRMH 
T\AVRLNEVIVKKSRDAKLVLLNMPGPPRNRNGDENY 


6779


2 


535 


RAI^RQPRLUU^GIEPESMAISEPIKGSRKPC^NKEELALKKP 
MAKCAWKGPREPPQDARAEAESPGGASESDQDGGHESPPKKKAV 
AKVSAKNPAPMRKKKKVSLGPVSYVLVDSEDGRKKPVMPKKGPG 
SRREASDQKAPRGQQPAEATASTSRGPKAKPEGSPRRATNESRK 
V 


6780 


3 


403 


HE VNDNKPE ININIiMS PGKEEI& YI FEGDP 1 DTFVALVRVQDKD 
SGLNGE I VCKLHGHGH FKl^KTYENNYI* ILTNATLDREKRS E YS 
LTVIAEDRGTPSLSTVKHFTVQINDINDNPPHFQRSRxEFVISE 
K 


6781 


1 


1269 


APTRPVFPTLQDIjSSS KE PSNSLNLPHSNELCS S L VHPEItS bvs 
SNVAPSIPPVMSRPVSSSSISTPLPPNQITVFVTSNPITTSANT 
SAALPTHIiQSALMSTVVTMPNAGSKVMVSEGQSAAQSNARPQFI 
TL^INSSSIIQVNIK^SQPSTIPAAPLTTNSGLMPPSVAVVGPL 
HIPQNIKFSSAPVPPNALSSSPAPNIQTGRPLVLSSRATPVQLP 
SPPCTSSPWPSHPPVQQVKELNPDEASPQVNTSADQNTLPSSQ 
STTM VS P LLTNS PGSSGNRRSPVS SS KGKGKVDKIGQlIfLTKRC 
KKVTGSLEKGEEQxG^UX3ETE^GLOTTApGLMGTEQLSTELDS 
ICTPTPPAPTLLKI'ITSSPVGPGTASAGPSLPGGALPTSVRSIVTT 
LVP SEL I SAVPTTKSNHGG IAS ESLAG 


6782 


3 


1327 


RkPTViRIPAkP^KCLHEDPQSPPPLPAEKPIGNTFSTVSGKLS 
NVERTRNLESNHPGQTGGFVRVPPRLPPRPVNGKTI PTQQPPTK 
VPPERPPPPKLSATRRSNKKLPFIWSSSDMDLQKKQSNLATQLS 
KAKSQVFKNQDPVLPPRPKPGHPLYSKYP^LSVPHGXANEDIVSQ 
NPGELS CKRGDVLVMLKQTENNYLBCQ1CG EDTGRVHLSQMKLI T 
PLDEHLRSRPNPFS PPKAPSHAQKP VDSGAPHA WLHDFPAEQ V 
DDLNLTSGEIVYLLEKIDTDWYRGNCRNQIGIFPANYVKVIIDI 
PEGGNGKRECVSSHCVKGSRCVARFEYIGEQKDELSFSEGEI I 1 
LKEYVNEEWARGEVRGRTG I FPLNFVE PVEDYPTSGANVLSTKV 
PLKTKKEDSGSNSQVNSLPAEWCEALHSFTAETSDDLSFKRGDR 

I i 


6783 


3 


1750 


S YHHHHAQQSAAAS PNLTASQKTVTTTSMITTKTLPLVLKAATA 
TMPAS WG QRPT I AMVTA INS Q KAVLS TD VQNT P VNLQTS S KVT 
GPGAEAVQIVAKNTVTLQVQATPPQPIKVPQFIPPPRLTPRPNF 
LPQVRPKPVAQNNIPIAPAPPPMLAAPOLIQRPVMLTKFTPTTL 
PTSQNS IH PVRWNGQTAT IAKTFPMAQLTSI VI ATPGTRLAGP 
QTVQLSKPSLBKQTVKSHTETDEKQTESRTITPPAAPKPKREEN 
PQKLAFMVSLGLVTHDHLEE I QSKRQERKRRTTANPVYSGAVFE 
PERKKSAVTYLNSTMHPGTRKRGRPPKYNAVLGFGALTPTSPQS 
SHPDS PENEKTETTFTF PAPVQPVSLPSPTSTDGDIHEDFCS VC 
RKSGQLLMCDTCSRVYHLDCLDPPLKTIPKGMW1CPRCX3DQMLK 
KEEAIPWPGTLAIVHSYIAYKAAKEEEKQKLLKWSSDLKQEREQ 
LEQKVKQLSNS 1 S K04EMKNTI LARQKBMHSSLEKVKQLI RL I H 
G I DLS KPVDS EATVGA1 SNGPDCTPPANAATSTPAPS PSS QS CT 
ANCNQGEETK 


6784 


3 


1750 


S YHHHHAQQSAAAS PNLTASQKTVTTTSMITTKTLPLVLKAATA 
TMPAS WGQRPT IAMVTAINSQKAVLSTDVQNTP VNLQTS S KVT 
GPGAEAVQI VAKNTVTLQVQATPPQP I KVPQFI PPPRLTPRPNF 
LPQVRPKPVAQNNIPIAPAPPPMLAAPQLIQRPVMLTKFTPTTL 
PTSQNS IHPVRWNGQTATIAKTFPMAQLTS I VI ATPGTRLAGP 
QTVQLS KPS LEKQTVKSHTETDE KQTES RT I TPPAAP KPKREEN 
PQKLAFMVSLGLVTHDHLEEIQSKKQERKRRTTANPVYSGAVPE 
PERKKSAVTYLNSTMHPGTRKRGRP PKYNAVLGFGALTPTSPQS 

RKSGQLLMCDTCSRVYHLDCLDPPLKT I PKGMWI CPRCQDQMLK 
KEEAI PMPGTTAI VHS YIAYKAAKEEEKQKLLKWSSDLKQEREQ 
LEQKVKQLSNS I SKCMEMKNTILARQKBMHSSLEKVKQLIRL I H 
GIDLS KPVDS EATVGAISNGPDCTPPAKAATSTPAPSPSSQSCT 
ANCNQGEETK 


6785


1 


528 


LGNTVLHYCSMYSKPECLKLLLRSKPTVDIVNQAGETALDIAKR 
LKATQCEDLLSQAKSGKFNPHVHVBYEWNLRQEEIDESDDDLDD 
KPSPVKKERSPRPQSFCHSSSISPQDKLALPGFSTPRDKQRLSY 
GAFTNQIFVSTSTDSPTSPTTEAPPLPPRNAGKGPTGPPITPHR 


6786 


1820 


1397 


RSPKVLVLAPTRELANHVSRDFKDI \TRKLTVARFYGGTSYQSQ 
INHIRNGIDILVGTPGRIKDHLQSGRLDLSKLRHVVLDEVDQML 
DLGFAEQVEDIIHESYKTDSEDNPQTLLFSATCPQWVYTVA\KK 
YMKSR YEQ VDLDGKOTQKAATTVEHLA IQCHVJSQRPAVIGDVLQ 
VYSGS GGRAI I FCETKKNVTEMAMNPH I KQNAQCLHGD I AQSQR 
E ITLKGFREGS FKVLVATNVAARGLDI PEVDLVIQSS PPQDVES 
YIHRSGRTGRAGRTGICICFYQPRERGQLRYVEQKAGITFKRVG 
ITwljVAil^r^AlK&liASVSYAAVDFFRPSAQRLIEEKGAV 
DALAAALAHI SGASS FEPRSLITSDKGFVTMTLESLEE IQDVSC 
AWKELNRKLSSNAVSQITRMCLLKGNMGVCFDVPTTESERLQAE 
WHDSDWILSVPAKLPEIEEYYDGNTSSNSRQRSGWSSGRSGRSG 
RSGGRSGGRSGRQSRQGSRSGSRQDGRRRSGNRNRSRSGGHKRS 
FD*VFYHLVDFLSDFLVDSVYLTGRQIDHLTGLTGLIDHLTSHS 
SVWN 


6787 


2*46 


2270 


PSS FPKNVPLEELEE PPK*KRSGLGSLTPKSQIQNGP * PQTFF F 
FELGSPSGVISAHCNLRLLGSSDSPAPASRVAGI IGTCHHAWLI 
LVFLVEMG FHHVGQAGLKLLTL\ VIH PPWPP KVLGLQT 


6788 


16 


936 


GGTVDLR\DMLAVSVLAAVRGGR/ATVRRVRESNVLHEK3KGKT 
REGAEDKMTSGDVLSNRKMFYLLKTAFPSVQINTEEHVD\ELDQ 
EVILV7GS*DS*GYPKGK*I>LPKEVPSR/RVLLSGLTPIiDATQE\ 
FTEDLSK\ YVTTMVCVAVNGKPMLGVIHKP FSEYTAWAMVDGGS 
NVKARS S YNEKTPR I WS RS HSGMVKQVALQTFGNQTTI I PAGG 
AG YKVLALLD VPDKS QEKADL YIHVT YI KKWD 1 CAGNAILKALG 

GHMTTLSGEElSYTGSIXSlEGGLLASIRMNHQAXVRKIiPDLEKT 
GHK 



6789



6790 



4068 



GNGINVLKIAPESAIKFMAYEQIKRI,VW**PGDS*GF/YERLVA 
GSIiAGAIAQSSIYPMEVLKTRIflALRKTGQYSGMIiDCARRlLARE 
GVAAF YKGY VPNMLG 1 1 P YAG IDLAVYE TLKNAWLQHYAVNS AD 
PGVPVLLACX3TMSSTCGQLAS YPLALVRTRMQAQA5 IEGAPEVT 
MSSLFKHI LRTEGAFGZiYRGLAPNFMKVI PAVS IS YWYENLKI 
TLGVQSR 



6792 



1193 



APPAGRRRMQAAPRAGCGAALLLWIVSSCLCRAWTAPSTSQKCD 
EPLVSGLPHVAPSSSSSISGSYSPGYAKINKRGGAGGWSPSDSD 
HYQWLQVDFGNRKQISAIATQGRYSSSDWVTQYRMLYSDTGRNW 
KPYHQDGNI WAFPGNINSDGWRHELQHPI IARYVR I VPLDWNG 
EGRIGLRIEVYGCSYWADVINET3GHVVLPYRFRNKKMXTLKDVI 
ALNFKTSESEGVILHGEGQQGDYITLBLKKAKLVLSLNLGSNQL 
GPIYGHTSVMTGSLLiDDHHWHSVVIERQGRSINIjTLDRSMQHFR 
TNGEFDYLDLDYEITFGGIPFSGKPSS3SRKNFKGCMESINYNG 

vnitdlarrkklepsnvgnlsfscvepytvpvffnatsylevpg 

RLNQDLFS VSFQFRTWNPNGLLVFSHFADNLGNVE IDLTESKVG 

vhinitqtkmsqidissgsglndgqwhevrflakenfailtidg 
deasavrtnsplqvktgekyffggflnqmnnsshsvlqpsfqgc 

MQLIQVDDQLVNLYEVAQRKPGS FANVS IDMCAI IDRCVPNHCE 
HGGKCSQTWDSFKCTCDETGYSGATCHNSIYEPSCEAYKHLGQT 
SNYYWIDPDGSGPMPLKVYCNMTEDKVWTIVSHDLQMQTPVVG 
YNPEKYS VTQLVYSASMDQ I SAI TDSAB YCEQYVS YFCKMSRLL 
NTPDGSPYTWWVGKANEKHYYWGGSGPGIQKCACGIERNCTDPK 
YYCNCDADYKQWRKDAGFLSYKDHLPVSQVWGDTDRQGSEAXL 
S VG PLRCQGDRNYWNAAS FPNPS S YLHFSTFQGETSADI S FYFK 
TLTPWGVFIiENMGXEDF I KLELKSATE VS FS FDVGNGP VE I WR 
S PTPLNDDQV7HRVTAERNVK0ASLQVDRLPQQIRKAPTEGHTRL 
EL YSQLF VGGAGGQQGFLGCIRSLRMNGVTLDLEBRAKVTSG F I 
SGCSGHCTSYGTNCENGGKCLERYHGYSCDCSNTAYDGTFCNKD 
VGAFFEEGMWLRYNFQAPATNARDSSSRVDNAPDQQNSHPDLAQ 
EEIRFSFSTTKAPCILLYISSFTTDFLAVIiVKPTGSLQIRYNLG 
GTREP YWIDVDHRNMANGQ PHS VNITRHBKT I FLKLDHYPS VS Y 
HL P S S S DTL FNS PKSLFLGKVIETGKIDQE I HKYNTPG FTGCLS 
RVQFNQIAPLKAALRQTNASAHVHIQGELVESMCGASPLTLSPM 
SSATDPWHLDHLDSASADFPYNPGQGQAIRNGVNRNSAIIGGVI 
A\WIFTPSLCTP\VLP*SR*HVSPHKGTI,PIPNEAKGAGSRQK 
KPGRRPSMNKDppTSQRPlDESKKEWPfgiRGGYLAMG 



TGHEGAKGEKGDKGDLGPRGERGQHGPKGBKGYPGIPPEL/PGW
SAW* SWIiTAASTKVQAILLPQPLE* LGLQIAFMASLATHFSNQ 
NSGIIFSSVETNIGNFFDVMTGRFGAPVSGVYFFTFSMMKHEDV 
EEVYVYIjMHNGOTVFSMYSYEMKGKSDTSSNHA\^KLAKGDEVW 
LRMGNGALHGDHQRFSTFAGFLLFETK 



1073 



VRHTNWGVDhYZJFSLGSESPKGAIGHI VSTEKXI LAVERNKVLL 
PPLMNRTFSWGFDDFSCCLGSYGSDKVLMTFENLAAWGRCLCAV 
CPSPTTIVTSGTSTWCVWELSMTKGRPRGLRLRQALYGHTQAV 
TCIAASVTFSLbVSGSQDCTCILWDLDHLTBIVTRLPAHREGISA 
ITISDVSGTIVSCAGAHLSLMNVNGQPLASITTAWGPEGAITCC 
CLMEGPAWDTSQI HTGSQDGMVRVWKT/VGCEDVCSWTASRRG 
APGSASKPKRPQVGEEPGLESRAGR*HCFDREAQQNQP\PVTAL 
AVSRWHTKLLVGDERGRIFCWSADG^EERGSRGSGTTVPG 


6793 


2340 


805 


GRKEANY\YGSIiTQAGTVSK3LDABGQEVFVPFSAVIiPMVAPND 
LVFDGWDISSLNIAEAMRRAKVLDWGLQEQLWPHMEALRPRPSV 
YIPEFIAANQSARADNLIPGSPAQQLBQIRRDIRDFRSSAGLDK 
VI VLWTANTERFCEVI PGLNDTAENLLRT IELGLE VS PS TLFAV 
AS I LEGCAFLNGSPQNTLVPGALELAWQHRVFVGGDDFKSGQTK 
VKS VLVD FL I GS GLKTMS I VSYNHLGNNDGENLS APLQ FRS KEV 
S KSNWDDM VQSNPVLYT PGEEPDHCWI KYVP YVGDS KRALDE 
YTS ELMWK5TNTLVLHNTCEDSLLAAP IMLDLALLTELCQRVS F 
CTDMDPEPQTFHPVLSLLSFLFKAPLVPPGSPVVNAI»FRQRSCI 
ENILRACVGLPPQNHMLLEHKMERPGPSLKRVGPVAATYPMLNK 
KGPVPAATNGCTGDANGHLQEBPPMPTT*GPGHTVSRLFLPAAP 
HDPTLKAPTNKGRCHFSPPSTWGSWGL 


6794 


169 


1344 


DDVXRKPEASAH*EKPGPPSRPGVRGGRERAGGRGSHGARSCR\ 
EPAP PAPAPPEDHPDEEMGFTI DI KS FLKPGEKTYTQRCRLFVG 
NLP TD ITEEDFKRLFERYGEPSE VFINRDRG FGF IRLESRTLAE 
IAKAELDGTILKSRPLRIRFATHGAALTVKNLSPWSNELLEQA 
FSQFGPVEKAWWDDRGRATGKGFVEFAAKPPARKALERCGDG 
AFLIiTTTPRPVIVEPMEQFDDEDGLPEKIjMQKTQQYHKEREQPP 
RFAQPGTFE FE YASRWKALDEMEKQQREQVDRNI REAKE KLEAE 
MEAARHEHQLMLMRQDLMRRQEELRRLEBLRNQELQKRKQ1QLR 
HEBEHRRREEEMIRHREQEELRRQQEGFKPNYMENYVCHFLR 


6795 


1740 


1010 


GPRRQTQ VRDI IELDS F* DWAAQETDCAQNSGERIi * KGV/ LENFS 
TMSKSAWISLDLLSNPLCEQDQDLLNMVTALDTAMKRMDAFNQ 
BKVNQ IQKT VI E PLKKFGS VFPS LNMAVKRREQALQD YRRLQAK 
VEKYEEKEKTGPVLAKLHQAREELRPVREDFEAKNRQLLEEMPR 
FYGSRLDYFQPSFESLIRAQWYYSEMHKIFGDLSHQLDQPGHS 
DEQRERENEAKLSELRALS I VADD


6796 


48 


683 


GKE IQ I PTI KLAWLLFGLE * PVGALGKGWS F * * S HVALGQLGW 
LTRAVRSSWRWELCVSAQEWSQRSA*SSPSPVGACPSIiNPPET 
S VQEGRDCWQR* LPRLFSALVGQPGCWPQGAPPERCV* PGRCKW 
HLQSQVLR*ERRRCCRCLPRFA*GWRRRHQRLGLGIHPAPLGST 
SPPHPEGNSQQCRR*GWAAELRLPSSWL*GKLGC* 


6797 


1620 


211 


TERMTPSQPTRGSSCTRFSSMLWTSTWRCLTCHWAGMRMSWGV 
TLGPMAQGLLSASGTTTEATWTRPTTHLTIilRWWLLTASRVDPP 
ERPPPP PSDDLTLLESSS S YKNL/DAQI PQ/DWSMS PSTSG * RP 
LTSRASSIMRSRTAIPSAS*SRLTTKHTVGGSPSAWRPRPTSRS 
VSTPVSSSTETTASGSCLTWWSSSPAPCPSSSAPAHSFEASCCK 
TSLWGSCGGSGDGSSACGSGWNLSMAGTSCSSPAMCSPSRAPS* 
RS AS R PRTWRATTS AASS WAPRRCW 0GWA*S AT * PSSTTTI S SS 
PHCS3WPCPASCASAAAWLSSTWATASVAGSCWGPIM*SSAHSPW 
CLSACSRSSMGTTCL*RSPP\SGASRAAAAWCX3SSPSSTFTPSS 
ASSSTWCSASSSRSSPAPTTPSSIPAAQAQRRASCRPTSHSART 
APPPAS SAAGAARPAAFSAAAEGTPRRS I RCW 


6798


3894 


1696 


STISWESLESWLNKATNPSNRQEDWEYIIGFCDQINKEIjEG*VS 
ALWGQLRGSGLGRGTTMAKEGQPGSPRLSALECVLLVPQ\PQIA 
VRLIiAHKI QSPQEWEALQALTYLGDRVS EKVKTKVI ELLYSWTM 
ALPEE AK I KDAYHMLKRQGI VQSDPP I PVDRTLI PS P PPRPKNP 
VFDDEEKSKLLAKLLKSKNPDDLQEANKLIKSMVREDEARIQKV 
TKRLHTLEEVNNNVRLLSEMLLHYSQEDSSDGDRELMKELFDQC 
ENKRRTLFKLASETBDNDNSLGDILQASDNLSRVINSYKTIIEG 
QVINGEVATLTLPDSEGNSQCSNQGTLIDLAELDTTNSIiSSVLA 
PAPTPPSSGI PI LPPP PQASGPPRSRSSSQAEATLGPSSTSNAL 
SWLDEELLCLGLADPAPNVPPKESAGNSQWHLLQREQSDLDFFS 
PRPGTAACGASDAPLLQPSAPSSSSSQAPLPPPFPAPWPASVP 
APSAGSSIiFSTGVAPAIAPKVBPAVPGHHGLALGNSALHHLDAL 
DQLLEEAKVTSGLVKPTTS PI, I PTTTPARPLLPFSTGPGSPLFQ 
PLSFQSQGSPPKGPBIjSLASIHVPLESIKPSSALPVTAYDKNGF 
RILFHFAKECPPGRPDVLWWSMLNTAPLPVKSIVLQAAVPKS 
MKVKLOPPSGTELSPFSPIQPPAAITQVMLLANPLKEKVRLRYK 
LTFALGEQLSTEVGBVDQFPPVBQWGNL 


6799 


3894 


1696 


STISWESLESWLNKATNPSNRQEDWEYIIGFCDQINKBLEG^VS 
ALWGQLRGSGLGRGTTMAKEGQ PGS PRLSAfcECVLLV PQ\ PQI A 
VIUjLAHKIQS PQEWE ALQALTYLGDRVSE KVKTKVI ELLYSWTM 
ALPEEAKIKDAYHMLKRQGIVQSDPPIPVDRTLIPSPPPRPKNP 
VFDDEEKSKLLAKLLKSKNPDDLQEANKLIKSMVREDEARIQKV 
TKRLHTLEEVNNNVRLLSEMLLHYSQEDSSDGDRELMKELFDQC 
ENKRRTL FKIAS ETE DNDNS LGDI LQAS DNL S RV1NS YKT 1 1 EG 
QVINGEVATLTLPDSEGNSQCSNQGTLIDLAELDTTlTSLSSVIiA 
PAPTP PS SGI PILPPPPQASGPPRSRSSSQAEATLGPSSTSHAIi 
SWLDEBLLCLGLADPAPNVPPKESAGNSQWHLIiQREQSDLDFFS 
PRPGTAACGASnAPLLQPSAPSSSSSQAPLPPPFPAPWPASVP 
APSAGSSLFSTGVAPALAPKVE PAVPGHHGLALGNSALHHLDAL 
DQLLEEAKVTSGLVKPTTSPLIPTTTPARPLLPFSTGPGSPLFQ 
PLSFQSQGSPPKGPELSLASIHVPLESIKPSSALPVTAYDKNGF 
RI LFHFAKECPPGRPDVLWWSMLNTAPLPVKS I VLQAAVPKS 
MK7KLQPPSGTELSPFSPIQPPAAITQVMLLANPLKEKVRLRYK 
IiTFALGEQLSTEVGEVDQFPPVEQWGNL 


6800 


404 


1646 


RRSPSTGLSPVPQPSSPSIiSDYSiPWSLLIiSGTIAWATPGK*AG 
* PQAW*LGLAPAIAFI /GLTRGRKQNKBKMAEGGSGDVDDAGDC 
SGARYNDWSDDDDDSNESKS I VWYPPWAR IGTEAGTRARARARA 
RATRARRAVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILE 
AALIALGNNAAYAFNRDI IRDLGGLP I VAK I LNTRDPI VKEKAL 
I VIjNNLS VNAENQRRLKVYMNQVCDDTI TSRLNSSVQIAGLRLL 
TNMTVTNEYQHMLANS ISDFFRLFSAGNEETKLQVLKLLLNIiAE 
NPAMTRELLRAQVPSSLG\SLFNKKENKEVI LKLLVI FENINDN 
FKWEENEPTQNQFGEGSLFFFLKEFQVCADKVLGIESHHDFLVK 
VKVGKPMAKLAEHMFPKSQE 


6801 


2 


1755 


SAEEFESO^ASVTMHDVDAESFEVLVlJYCimkVSLSEANVBRL 
YAASDMLQLEYVREACASFLARRLDLTNCTAILKFADAFGHRKL 
RSQAQS Y I AQNFKQLSHMGS IREETLADLTLAQLIAVLRLDSLD 
VESEQTVCHVAVQWLEAAPKERG PSAAE VFKCVRWMH FTEEDQD 
YLEGLLTKP I VKKYCLDVIEGALQMRYGDLL YKSLVP VPNSS S S 
/R* QQQLS CICSRKSTPETGYVCQGDGDLLWTPQRSLS \RYDpY 
SGDI YTMPS PLTSFAHTKTVTSSAVCVSPDHDI YLAAQPRKDLW 
VYKPAQNS WQQLADRLLCREGMDVAYLNGYI YILGGRDPI TCVK 
LKEVECYS VQRNQWALVAP VPHS FYSFEL I WQNYLYAVNS KRM 
LCYDPSHNMWLNCASLKRSDFQEACVFNDEI YCICDI PVMKVYN 
PARGEWRRI SN I PLDS ETHNYQ I VNHDQKLLL ITSTTPQW KKNR 
VTVYEYDTREDQW INIGTMLGLLQFDSGFI CLCARVYP S CLEPG 
QSFITEEDDARSESSTEWDIjDGFSELDSESGSSSSFSDDEVWVQ 
VAPQRNAQDQQGSL 


6802


157 


1341 


ETFPLE* FFLLS KTTPGKTASMAft FVQGfrS RM I AAESSTEkKECAE 
PSTRKNLMMSLEQKIRCLEKQRKELLEVNQQWDQQFRSMKELYE 
RKVAELKTKLDAAERFLSTREKDPHQRQRKDDRQREDDRQRDLT 
RDRLQREEKEKSRLNEELHELKEENKLLKGKNTLANKEKEHYEC 
EIKRLNKALQDALNIKCSFSEDCLRKSRVEFCHEEb4RTBMEVLK 
QQVQIYEEDFKKERSDRERLNQEKEELQQINETSQSQLNRLNSQ 
I KACQME KEKLEKQLKQMYCPPCNCX3LVFHI*QD P WVPTGPGAVQ 
KQREHPPDYQWYALDQLPPDVQHKAN/DWCLAPPPVCCQAG/PR 
TPGLK* S S CLWLPKC*NFRFII»S KBSPS VE VHTNRERQQATRER 
G 


6803 


1 


2203 


KLSGRP YRHMGVI/3TS KLYDIRKTI FTFTPQF IDQQQFYLALDN 
KMIVEMLRTDLSYLCSRWRMTCQPTITFPISHSMLDriDfeTSl^fS
S I LAALRKMQDG YPGGARVQTG KLS E FLTTS CCTHLS FMDPGP E 
GKLYSEDYDDNYDYLESGNWMNDYDSTSHARCGDEVARYLDHLL 
AHTAPH P KLA PT S QKGGLDR FQ AAVQTT CDLMS LVT KAKELH VQ 
NVHMYLPTKLFQASRPS FNLLDS PHPRQENQVFSVRVE IHLPRD 
QSGEVDFKALVLQLKETSSLQEQADILYMLYTMKGPDWNTELYN 
ERSATVRELIiTELYGKVGBIRHWGLIRYISGILRKKVEALDEAC 
TDLLSHQKHIjTVGLP PE PREKT I SA PLP YEALTQLI DE AS EG DM 
SISILTQEIMVYLAMYMRTQPGLFAEMPRLRIGLIIQVMATELA 
HSLRCSAEEATEGLI^LSPSAMKNLLHHILSGKEFGVERK/SVR 
PTDSNVS PAI S IHE IGAVGATKTERTG IMQL KSB IKQVE FRRL S 
I SAESQS PGTSMTPS SGSFPSAYDQQSSKDSRQGQWQRRRRLDG 
ALNRVPVGFYQKVWKVLQKCHGLS VEG FVL P SSTTREMTPGE I K 
FSVHVES\VLNVLIiRPEYRQLLVEAII>VLTMLADIEIHSIGSII 
AVEKIVHIANDLFLQEQKTLGP\DDTMLAKDPASG\ICTLR\YD 
SAPSGRFGTMTYLS\RAA\ATYVQEFLP\HS ICAMQ 


6804 


1 


951 


GSPGKKEEKAKNKESLCMENSSNSSSDEDEEETKAKMTPTKKYN 
GLEEKRKSLRTTGFYSGFSEVAEKRIKLLNNSDERLQNSRAKDR 
KDVWSSIQGQWPKKTLKELFSDSDTEAAASPPHPAPEEGVAEES 
LQTVAEEES C3 PSVBLEKPPPVNVDS KP I EE KTVEVNDRKAEFP 
SSGSNFSA*IPLPYLHLNRIiHQSL*QKGSRQQSSVTVSEPLAPN 
QEEVRSIKSETDSTIEVDSVAGELQDLQSERE*LASRF*CQCb'L 
KQ* *SARTRTS*KSLYRSEKSERCSGRRKFI KKAEKKP * SNSGK 
QQKEGKRHK 


6805 




206 


RQPDLKYFGKSFDVSVSESSSLLSNDLPKFADGIKARNRNQNYL 
VPSPVLRILDHTAFSTBKSADIVICDEECDSPESVNQQTQEESP 
I EVHTAEDVP I AVEVHAI SED YD I BTENNSS E S LQDQTDEE P PA 
KLCK I LDKS QALNVTAQQKWP LLRANS SGL Y KCELCE FNSKYFS 
DLKQHMILKHKRTDSNVCRVCKESFSTNMLL I EHAKLHEEDP Y I 
CKYCDYKTVIFENLSQHIADTHFSDHLYWCEQCDVQFSSSSELY 
LHFQEHSCZDEQYLCQFCEHETNDPEDLHSHVVNEHACKLIELSD 
KYNNGEHGQYSLLSKITFDKCKNFFVCQVCGFRSRLHTNVNRHV 
AIEHTKIFPHVCDDCGKGFSSMLE\IAKHLNSHLSEGIYLCQYW 
EYSTGQIEDLKIHLDFKHSADLPHKCSDCLMRFGNERELISHLP 
VHETT 




272 


3794 


VALCFPNSDPVMFMDAFYGCLLAELGPVPIBVPL.TRKDAGSQQV 
GFLLGSCGVFLALTTDACQKGLPKAQTGEVAAFKGWP PLS WLVI 
DG KHLAKPP KD WHPLAQDTGTGTAY I E Y KTS KEGSTVGVTVSHA 
S LLAQ CRALTQACGY S EAETLTNVLDF KRDAG LWHG VLTS VMNR 
MHWSVPYALMKANPLS WIQKVCFYKARAALVKS RDMHWSLIiAQ 
RGQRDVSLSSLRMLrVADGANPWSISSCDAFLNVFQSRGLRPEV 
ICPCASSPEALTVAIRRPPDLGGPPPRKAVLSMNGLSYGVIRVD 
TEEKLSVLTVQDVGQVMPGANVCWKLEGTPYLCKTDEVGEICV 
SS SATGTAYYGLLG ITKNVFEAVPVTTGGAP I FDRP FTRTGLLG 
F IGPDHLVF I VGKLDGLM VTGVRRHNADDVVATALAVE PMKFVY 
RGRI AVFS VTVLHDDR I VLVAEQRPDAS EEDS FQWMS R VLQA I D 
S IHQVG VYCLAL V PANTL P KAPLGG I H I SET KQR FLEGTLH PCN 
VI^CPHTCVTNLPKPRQKQPEVGPASMIVGNLVAGKRIAQASGR 
ELAHLEDSIX3ARXFLFLADVLQWRAHTTPDHPLFLLLNAKGTVT 
STATCVQLH KRAE RVAAALME KGRL S VGDHVALVYP PGVDIilAA 
FYGCLYCGCVP VTVRP PHPQNLGTTL PTVKMIVEVS KSACVLTT 
QAVTRLLRSREAAAAVDIRTWPTILDTDDIPKKKIASVFRPPSP 
DVLAYLDFS VSTTG I LAGVKMSHAATSALCRS I KLQCELYPS RQ 
IAICLDPYCGLGPALWCLCSVYSGHQSVLVPPLELBSNVSLWLS 
AVSQ YKARVTFCCYS VMEMCITCGIX^OTG VLRMKGVNIiS CVRTC 
MWAEERP\RIALTQSFSKLFKDLGLPARAVSTTFGCRVNVAIC 
LQGTAGPDPTTVYVDMRALRHDRVRLVERGS PHSLPLMESGKIL 
PGVKVI IAHTETKGPLGDSHLGE I WVSS PHNATGYYTV YGEEAL 
HADHFSARLS FGDTQTI WARTG YLGFLRRTELTDASGGRHDAL Y 
VVGSLDETLELRGMRYHPIDIETSVTRAHRS IAECAVFTWTNLL 
WWELDGLEQDALDLVALVTNWLEEHYLWGVWIVDPGVI P 
INSRGEKQRMHLRDGFLADQLDPI YVAYNM 


6807


1444 


606 


VGHDTVHAMFTCFPKCI^FSPPVNVTVSPRSEESHTTTVSGGNG 
S VFQAGPQLQALANLEARRGS IGAALSSRDVSGLP V YAQSGE PR
RLTQAQVAAFPGENALEHSSDQDT WDSLRS PG PCS PLS SGGGAE 
SLP PGG P GHAEAGHLGKVCDFHLNHQQPS PTS VLPTEVAAPPLE 
KILSVDSVAVDCAYRTVPKPGPQPGPHGSLLTEGCLRSLSGDLiN 
RFPCX3MEVHSGQPJELESVVAVGEAMA\LKFPMGAMSYCLRDRSR 
FLFRLPMGLSCPLQVQ 


6808 


2063 


737 


GVGSGAASALARSRPLASRLSSRRRTRAPRSGAMQRLAMDLRML 
SRELSLYLEHQVRVGFFGSGVGLSLILGFSVAYAFYYLSSIAKK 
PQLVTGGES FSR FLQDHCP WTBTYYPTVWCWEGRGQTLLRPF\ 
ITS KPPVQYRNELIKTADGGQISLDWFDNDNSTCYMDASTRPTI 
LLLPGLTGTSKES YI LHMIHLSEELGYRCVVFNKRGVAGENLLT 
PRTYCCANTEDLETVIiniVHSIiYPSAPFIjAAGVflMGGMLLIiNYI* 
GKIGSKTPLMAAATFSVGWNTFACSESLEKPLNWLLFWYYLTTC 
LQS S VNKHRHMFVKQVDMDHVMKAKS I RE FDKRFTS VM FGYQT I 
DDYYTDASPS PRLKSVG IPVLCLNS VDDVFS PSHAIPI ETAKQN 
PNVALVLTS YGGHIGFLEG I WPRQS T YMDRVFKQFVQAMVEHGH 
ELS 


6809 


939 


65 


DYSGQTPVPTEHGMTLYTPAQTHPEQPGSEASTQPIAGTQTVPQ 
T0EAAQTDSQPLHPSDPTEKQQPKRIjHVSNI pfrfrdpdlrqmf 
GQFGKILDVEI I FNERGSKGFGFVTFETSSDADRAREKLNGTIV 
EGRKIEVWATARVMTNKKTGNPYTNGWKLNPVVGAVYGPEFYA 
VTGFPY PTTGTAVAYRGAHLRGRGRAVYNTFRAAPPPP P I PTYG 
AWYQDGFYGAE I \LEATQPTDTIiS PLQRRQ PTATVTAESTQLP 
TRT I TPSGPRRP TALEPCETFHRFLLGP 


6810


939 


65 


DYSGQTPVPTEHGMTLYTPAQTHPEQPGSEASTQPIAGTQTVPQ 
TDEAAQTDSQPLHPSDPTEKQQPKRLHVSNIPFRFRDPDLRQMF 
GQFGKI LDVE 1 1 FNERGS KGFGFVT FETSSDADRAREKLNGTIV 
EGRKIEVNNATARVMTNKXTGNPYTNGWKLNPVVGAVYGPEFYA 
VTGFP YPTTGTAVAYRGAHLRGRGRAVYNTFRAAP PPPP I PTYG 
AWYQDGFYGAE I \LEATQPTDTLS PLQRRQPTATVTAES TQLP 
TRTITPSGPRRPTALEPCETFHRFLLGP 


6811


1522 


656 


DLVTVWS FVDCR VIASTHGH \ KS WVS WAFDP YTTS VEEGDPME 
FSGSDEDFQDLLHFGRDRADSTQCRLSRRNSTOSRPVSVTYRFG 
SVGQDTQLCLWDLTEDILFPHQPIiSRARTHTNVMNATSPPAGSN 
GNSVTTPGNSVPPPLPRfiNSLPHSAVSNAGSKSSVMDGAIASGV 
SKFATLSLHDRKERHHEKDHKRNHSMGHISSKSSDKLNLVTKTK 
TDPAKTIiGTPLCPRMEDVPIiIiEPLICKKIAHERLTVLlFLEDCI 
VTACOEGFICTWGRPC3KVVQPTJD 


6812


4001 


1*82 


EDAVFSLDLSTI IQGTWFLNGEELKSNEPEGQVE PGALR YR IEQ 
KGLQHRL ILHAVKHQDSGALVGFSCPGVQDSAALTIQES PVHI L 
SPQDKVSLTFTTSERVVLTCELSRVDFPATWYKDGQKVEESEIjL 
WKMDGRKHRLILPEAKVQDSGEFECRTEGVSAFFGVTVQDPPV 
HIVDPREHVFVHAITSECVMLACEV\DR\EDAPVRWYKDGQEVE 
ESDFVVLENEGPHRRLVLPATQPSDGGEFQCVAGDECAYFTVTI 
TDVSSWIVYPSGKVYVAAYRLERWLTCELCRPWAEVRWTKDGE 
EWESPALLLQKEDTVRRLVLPAVQLEDSGEYLCEIDDESASFT 
VTVTEPPVRI IYPRDEVTLIAVTLECWLMCELSREDAPVRWYK 
DGLE VEES BAXVLERDGPRCRIiVLPAAQPEDGGE FVCDAGDDSA 
FFTVTVTEPPVQFLALETTPSPLCVAPGEPVVLSCELSRAGAPV 
VWSHNGRPVQEGBGLBLHAEGPRRVLCIQAAGPAHAGLyTCQSG 
AAPGAPSLS FTVQVAEPPVRWAPEAAQTRVRS TPGGDLE LWH 
LSGPGGPVRWYKDGJSRIASQGRVQLEQAGARQVLRVQGARSGDA 
GE YLCDAPQDSRI FLVSVEEPLLVKLVSDLT P LTVHEGDDATFR 
CEVS P PDADVTWLRNGAWTPGPQRQSCCS YGGCRMCGQRKART 
CVSKWRQAEWVQRGPCAGCEVGSPCPTTLACPWPRMGTSTASSS 
MVSYWPTRAPTAARATTIAPWPGSA 


6813 


9 


836 


SSTQQRPGVPAGPRPLDGYLGVADHKPLKMHCRDCALVTSSGHL 
LHSRQGSQ I DQTECVI RMNDAPTRG YGRDVGNRTSLRV I AKSSI 
QRILRNRHDLLNVSQGTVFI FWGPSS YMRRDGKGQVYWNLHLIaS 
QVLPRLKAFMITRHKMLQFDELFKQETGQ\NRKISNTWLSTGWF 
TMTIALELCDRINVYGMGPPDFCRDPNHPSVPYHYYEPFGPDEC 
TMYLSHERGRKGSHHRFITEKRVFKNWARTFNIHFFQPDWKPES 
LAINHPENKPVF 


6814 


3 


737 


KFRRQEAN/ARERNRMHGLNDALDNLRKWPCYSKTQKLSXIET 
LRIiAKNYIW/VLSEILRIGKRPDLLTFVQNLCKGLSQPTTNLVAG 
CLQLNARS FLMGQGGEAAHHTRSP YSTFYPPYHSPELTTPPGHG 
TLDNSKSMKPYNYCSAYES FYESTS PECASPQFEGPLSPPPINY 
NGIFSLKQEETLDYGKNYNYGMHYCAVPPRGPLGQOAMFRLPTD 
SHFPYDLHLRSQSLTMQDELNAVFHN 


6815 


906 


553 


QGLDPASQTKWELLKDGSGRRGDRRSSRDMAGGAGPRSESDLE 
DVGPTAEWNGDGSGSLRRSGSFGKLRDALRRS S EMLVKKLQGGT 
PQEPPNPRMKRASSLNFLNKSVEEPTQPGG 




1 


803 


NLLKTHKF\LLGODEDSLHSVPVAQMGNYQEYLKTLASPLREiD 
PD QPKRLHTFGNPFKQDKKGMM I DEADEF VAG PQNKVKRPGEPN 
SPMSSKRRRSMSLLLRKPQTPPTVTNHVGGKGPPSASWFPSYPN 
LIKPTLVHTDATIIHDGHEEKMENGQITPDGFLSKSAPSELINM 
TGDLM P PNQ VD S LS DDFTSLS KDGXj I QKPGSNAFVGGAKN CS LS 
VDDQKDPVASTI/3AMPNTLQITPAMAQGINADIKHQLMKEVRKF 
GRSK 


6817 


172 


3457 


I^MDSPKIGN&LPVIGPGTDIGISSLHMVGYLGKNFDSAKVPS 
DEYC PACKEKGKLKALKTYRI SFQES I FLCEDLQCI YPLGS KSL 
NNLISPDLEECHTPHKPQKRKSLESSYKDSLLLANSKKTRNYIA 
IDGGKVLNS KHNGEVYDETSSNLPDSSGQQNP IRTADSLERNEI 
LEADT VDMATTKDPATVD VS GTGR PS PQNEG CTSKLEMPLESKC 
TS FPQALCVQW KNAYALCWIiDCILSALVHS EELKNTVTGLCSKE 
ESIFWRIiLTKYNQANTLLYTSQLSGVKDGDCKKLTSKIFAEIET 
CLNEVRDE I FIS LQ PQLRCTLGDM ES P VFAFPLLLKLETH I EKL 
FLYSFS WDFECS QCGHQYQNRHMKSLVTFTNVI PEWHPLNAAHP 
G PCNNCNS KS Q I RKMVLEKVS P I FMLHFVEGLPQNDIjQHYAFHF 
EGCIiYQITSVIQYRANNHFITWILDADGSWLECDDIiKGPCSERH 
KKFEVPASEIHIVIWERKISQVTDKEAACLPLKKTNDQHALSNE 
KPVSLTS CS VGDAAS AETAS VTHPKD I SVAPRTLSQDTAVTHGD 
HLLSGPKGLVDN I LP LTLE ETXQKTAS V5QLNS EAFL\liENKP V 
AENTGILKTNTLLSQESLMASSVSAPCNEKLIQDQFVDISFPSQ 
VVNTNMQS VQLNTEDTVNTKS VNNTDATGLIQGVKS VEI EKDAQ 
LKQFLTPKTEQLKPERVTSQVSNLKKKETTADSQTTTSKSLQNQ 
SLKENQKKPFVGSWVKGLISRGASFMPLCVSAHNRNTITDLQPS 
VKGVNNFGGFKTKGINQKASHVSKKARKSAS KP P P I S KP PAGPP 
S SNGTAAHPHAHAAS E VLE KS GST S CGAQLNHS S YGNG I S S ANH 
EDLVEGQIHKLRLKLRKKLKAEKKKLAALMSSPQSRTVRSENLE 
QVPQDGS PNDCES IEDLLNELP YP IDI ANES ACTTVPGVSLYSS 
QTHEEIIiAELLSPTPVSTELSENGEGDFRYLGMGDSHIPPPVPS 
E FND VS QNTHLRQDHN YCS P TKKNPCE VQ PDS LTNNACVRTLNL 
ESPMKTDI FDEFFSSSALNALANDTLDLPHFDE YLFENY 


6818 


2 


240 


RGFDKVLWT/LSGAVK\CVQFSRISPDGEEGYPGELKVWVTYTL 
DGGE/LHS/ATTEHKP/VQATPVNLT\TILTSTWQARLPQI 


6819 


1 


961 


G I PCTEMGNFDNANVTGE I E FA I H YC FKTHSLE I CI KACKNLAY 
GEEKKKKCWPYVKTYLLPDRSS QGKRKTGVQRNTVDPTFQETIiK 
iQVAPAQIjVTKQIjQVSVwHIaSTLAR 

TTQSFRWHPLRAKADKYEDSVPQSNGBLTVRAKLVLPSRPRKLQ 
BAQEGTDQPSLHGQLCLVVLGAKNLPVRPIX3TLNSFVKGCLTLP 
DQQKLRLKSPVLRKQACPQWKHSFVPSGVTPAQIiRQSSIiELTVW 

DQALFGMNDRLLGGT\RLGSKGDTAVGGnACSQSKLQWQKVLSS 

DMT wnriMTT \n u * 
rviJun x UnL h Vixn 


6820 


1014 


340 


GDMVYIVGHVPPGFFEKTQNKAWFREGFNEKYLKVVRKHHRVIA 
GQFFGHHHTDS FRMLYDDAGVPISAMFITPGVTPWKTTLPGWN 
GANNPA IRVFEYDRATLSLKDMVTYFMNLSOANAQGTPRWELEY 
QLTEAYG VPDASAHSMHTVIiDR I AGDQSTLQRYYVYNS VS YSAG 
V uUbALb My H V LAMKQ VDI DAY TTCLYASGTTPVPQLPLuLMAL 
LGLCT 


6821 


1088 


518 


EFDIYR/EVGGEFVPVTRDDSSNGFPRTQHGPSPTVHPIQSPQN" 
RFCVLTLDPETLPAIATTLIDVLFYSHSTPKEAASSSPEPSSIT 
FFAFS LI EG Y I \ S I VMDAETQKKFPSDLLLTS S SGELWRMVRIG 
GQPLGFDECGIVAQIAGPLAAADISAYYISTFNFDHALVPEDGI 
GSVIEVLQRRQEGLAS 


6822 


1088 


518 


EFDIYR/EVGGEFVPVTRDDSSNGFPRTQHGPSPTVHPIQSPQN 
RFCVLTLDPETLPAIATTLIDVLFYSHSTPKEAASSSPEPSSIT 
FFAFSLIEGYl\SIVMDAETQKKFPSDLLLTSSSGELWRMVRIG 
GQPLG FDE CG I VAQI AGPLAAAD I SAYY I ST FNFDHALVPEDGI 
GSVIEVLQRRQEGIAS 


6823 




221 


P PXLLS RWARMGHGDB I V \LSDLNF PG LLHLPWGPWRS VQTAC 
G I PQLLE AVLKLLPLDT YVES PAAVMELVPSDKERGLQTP VWTE 
YES I LRRAGCVRALAKIERFE FYERAKKAFAWATGETAL YGNL 
ILRKGVLALNPLL 


6824 


858 


104 


LL1AQRWGWG\ CCFFSLAVS VKMNVLLFAPGIiFLLJliXQFGFRG ' 
ALPKLGICAGLQWLGLPFLLENPSGYLSRSFDLGRQFLFHWTV 
NWRFLPEALFLHRAFHLALLTAHLTLLLLFALCRWHRTGES I LS 
JaJjRDPSt^^PPQPLTPNQXVSTLFTSNFIGICFSRSLHYQFYV 
WYFHTLPYLLWAMPARWLTHLLRLLVLGLIELS WNTYPSTS CSS 
AALHICHAVILLQLWLGPQPFPKSTQHSKKAH 


6825 


3 


1173 


SSGEFGLQASD IMWTISDTGWILI ILCSLMEPWALGACTFVHLli 
PKFDPLVILKTLSSYPIXSMMGAPIVYRMLLQQDLSSYKFPHLQ 
NCLAGGBS LLPBTLENWRAQTGLDIREFYGQTETGLTCMVS KTM 
KI KPG YMGTAAS CYDVQ 1 1 DDKGNVL P PGTEGD IG IRVKP I RP I 
G I FSG YVDNPDKTAAMI RGDFWLLGDRGI KDBDGYFQFMGRADD 
I INS SGYR IGPS BVBNALMEHPAVVETAVISS PDP VRGEVVKAF 
VILALvFLSHDPEQLTKEI^HVKSVTAPYKYPRKIEFVLNLPK 
TVTGKIQRA\KLRDKEWKMSGKAPCAVRHLRDIHLDSPLLSLSF 
P FGPLALPMDG YGDSLWE EH E YKFCLALV ISTKLYHVRC 


6826 


2304 


954 


LKTES F KP W/ VN I ALAFHLLG ERAS PUS FWQP Y 1 QTLPRE YDTP 
LYFEEDEVRYLQSTQAIHDVFSQYKNTARQYAYFYKVIQTHPHA 
NKLPLKDS FTYBDY^WAVSS VMTRQNQI PTEDGSRVTLAL I PLW 
DMCNHTNG L ITTG YNLEDDRCECVALQD FRAGEQ I YI FYGTRSN 
AEFVIHSGFFFDNNSHDR\AKIKIiGVSKSDRLYAMKAEVLARAGl 
PTSS VFALHFTBP P I SAQLLAFLRVFCMTEE ELKEHLLGDSAI D 
RI FTLGNS EFP VS W DNEVKLWTFLEDRAS LLLKTYKTTIEEDKS 
VLKNHDLSVRAKMAIKIjRIX5EKEII^KAVKSAA\^EYY^QQME 

ekaplp ky e esnlg ll es s vgdsrlplvlrnle eeagvqdalni 
reaiskakatenglvngensipngtrsbneslnqeskravedak 
gsssdstagvke 
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P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
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6827


1 


779 


SSVVEFGLSVLGGLFLLFVLENMIjGLLRHRGLRPRCCRRKRRNL 
ETRNLDPENGSGMAliQPLQAAPEPGAQGQREKNSQHPPALAPPG 
HQGHSHGHQGGTDITWMVUiGDGLHNIiTDGLAIGAAFSDGPSSG 
LSTTLAVFCHELPHELGDFAMLLQSGLSFRRLLLLSLVSGALGL 
GGAVLGVGLSLGPVPLTPWVFGVTAGVFLYVALVDMLPALFPSS 
GAPAYA\HVLLQGLGLLLGGCLMLAITLLEERLLPVTTEG 




1 3 


1654 


KSQHG/ WI LQLMHS CKEG YVKDIiKGNPGLHRAMLDIiDNGTRPS E 
LGHLSQTASLKRGS S FQSGRDDTWRYKTPHRVAFVEKLTKLVLS 
QLPNFWKLWI S YVNGSLFSETABKSGQ1ERS KNVRQRQKDPKKM 
I Q EVMHS L VKLTRGALL PLS I RDGEAKQYGGWEVKCELSGQWLA 
HAIQTVRLTHBSLTALE I PNDLLQT1QDL I LDLRVRCVMATLQH 
TAEEIKRLAEKEDWIVDNEGLTSLPCQFEQCIVCSLQSLKGVLE 
CKPGEASVFQQPKTQBEVOQLS INI MQVFI YCLEQLSTKPDAD I 
DTTHLSVDVSS PDLFGS IHEDFSLTSEQRLLI VLSNCCYLBRHT 
FLNIAEHFEKHNFQGIEKITQVSMA3LKBLDQRLFENYIELK?U) 
PIVGSLEPGIYAGYFDWKDCLPPTGVRNYLKEALVNIIAVHAEV 
FTX SKELVPRVLS KVI EAVSE ELSRLMQCVS S FS KNGALQARLK 
I CALRDT VAVYLTPES KS S F KQ ALEALPQkS 5GADKKLLEELLN 
KFKSSMHLQLTCFQAASSTMMKT


6829 


1 


782


MRMEAGEAAP PAGAGGRAAGGWGKWVRLNVGGTVFLTTRQTLCR 
EQKS FLS RLCQGEELQSDRDETGAYLIDRDPT YFGPI LNFIiRHG 
KLVLDKDMAEEGVLE EAE F YNI G PL I RI 1 KDRMEEKDYTVTQVP 
PKHVYRVLQCQEEELTQMVSTMSDGWRFEQLVNIGSSYNYGSED 
QABFtiCWSKELHSTPNGLSSBSSRKTKSTEEQLEEQQQQEEEV 
EBVEVEQVQVEADAQEK/CCYKPEAPGCEAPDHLQGLGVPI 


6830 


1 


939 


MEPGSVENLSlVYRSRDFLWNKHWDVRIDSKAWRETLTfWkQL 
RYRFPELADPDTCYGFRFCHQLDPSTSGALCVALNKAAAGSAYR 
CFKERRVTKAYLALLRGHIQESRVTISHAIGRNSTEGRAHTMCI 
EGSQGCENPKPSLTDLWLEHGLYAGDPVSKVLIjKPLTGRTHQL 
RV\HCS ALGH P WGDLT YGE VSGREDRPFRMMLHAFYLR I PTDT 
EC VE VCTPDP FLPS LDACWSPHTLLQS LDQIjVQAIjRAT PDPDPE 
DRGPRPGS PSALLPGPGRPPPPPTKPPETEAQRGPCliQWLSEWT 
LEPDS 


6831 


3 


1087 


SLFFGSSTPDNKVAEQEt)LE'fOP$PSV2KAVTVlDPEGTIPTNF 
NVAEKPADHSLS E VKL KTADEPRGTLVKSGDGQNVKEKSM I LSN 
VEDLQQPKFI SE VSREDYGKKE I SGDSEBMN INS WTSADGENL 
EIQSYSLIGBKliVMEEAKTIVPPHVTDSKRVQKPAIAPPSKWNI 
3 1 FREE PRSDQKQKSLIiS FDWDKVPQQPKSASSNFASKNITKE 
SEKPESIILPVEESKGSLIDFSEDRLKKEMQNPTSLKISEEETK 
LRS VS PTEKKDNLENR \ S YTlAAE KKVLAEKQNS V\ APLE LRDS 
NE IGKTQITLGSRSTELKES KADAMPQHFYQNEDYNERPKI I VG 
SEKEKDEKKXK 


6832 


1809 


412 


MGSGLISGPPQDNSGEALKEPERAQEHSLPNFAGGQHFFEYLLV 
VS LKKKRSEDD YE P 1 1 TYQ FPKRENLLRGQQEBEERLLKAI PLF 
CFPDGNEWASLTBYPRETFSFVLTNVDGSRKIGYCRRIiPAGPG 
PRLPKVYCI I S C I GCFGLFS KI LDE VEKRHQ I SMA VI YP FMQGL 
REAAFPAPGKTVTLKSFI PDSGTKFI SLTRPLDSHLEHVDFSSL 
LHCLSFEQILQX FASAVLERXIIFLAEGLSTLSQCIHAAAALLY 
PFS WAHTYIP WPESLLATVCCPTP FMVGVQMRFQQEVMDS PME 
EVLLVNLCEGTFLMSVGDEKDILPPKLQDDILDSLGQGINELKT 
AEQ INEHVSGPFVQFFVKI VGHYAS YI KREANGQGHFQER3 FCK 
ALTSKTNRRFVKKFVKTQLFSLFIQEAEKSKNPPAGYFQQKILE 
YEEQKKQ/TETKGKNCEI RAWNKND 


6833 


1 


1129 


PLMTLSQCGGIPGHGHSHGGHGHGHGIiPKGPRVKSTRPGSSDIN 
VAPGBQGPDQEBTNTLVANTSNSNGLKLDPADPENPRSGDTVEV 
QVNGNLVRE PDHMELEEDRAGQLNMRGVFLHVLGDAliGSVI VW 
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Codon, /=possible nucleotide deletion,
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NALVFYFSWKGCSEGDFCVNPCFPDPCKAFVE I INSTHAS VYEA 
GP C WVLYLDPTtXIVVM VCILLYTT YPLLKESAL ILLQTVPKQl D 
IRNLIKELRNVEGVEEVHELHVWQLAGSRIIATAHIKCEDPTSY 
ME VAKTI KDVFHNHG I KATT IQ PE FASVGS KSS WPCELACRTQ 
CALKQCCGTLPQAPSGKDAEKTPAVS ISCLBLSNNLEKKPRRTK 
AENI PA\ WIE IKN\ I PKK\QPESSL 


6834 


78 


1151 


AGQERPAPIWRLLWLPTPSVSRKAEPAHIPINR*GA*E*RGGLP 
LCGSSASAYGWH*RLTPWSPGGS*HM*SSKAPVTQAREVLVAGP 
CSKLVLSGARG I VGTTVQVLVEAQQPLLLLFTGVWG LNLRAGEE 
SRAL *LIEEVTQVRDAHLGNAWGCAQCLSQGQVGSALAKALLE 
AAAAVRDCKEVLTVSGDKQQAEVS VRL * VRD VCVEEAGCVEFX3Q 
AHGRPGLAIiAKGRGGTNEVEEQVQVDGVQKLVLSAHECHELVAG 
Q<3 DGE DQAARTRLLQ AG AHS VAHG RRQGQAP CRPHQEAG VS CHE 

LQQWGDAL*ARE+APQI IVLLLLEDVAQLRTGKKA*DLWDVE 
QLLRQL 


6835 


1 


834 


G I PAADR \ EASLELI KLDISRTFPNIjCI FOXK3GP YlibMLHS lEG 
AYTC YRPDVG YVQGMS F I AAVL 1 LNLDTADAFI AFSNLLNKPCQ 
MAFFRVDHGLMLTYFAAFEVFFEENLPKLFAHFKKNNLTPDIYL 
I DW I FTL YS KSLPLDIiACRI WDVFCRDGEBFLFRTALGILKLFE 
DILTKMDFIHMAQFLTRLPEDLPAEELFAS IATIQMQSRNKKWA 
QVLTALQKDSREMREGKSVPPTLRLQREFALGTNQSPMPRPLCC 
FRLTPGQPRRTDAL 


6836 


1 


850 


MSCGRPPPDVDGMITLKVVD^TVRTSpDdLRRVFEKV6RVGD\T' 

YIPREDHTKAPRGFAFVRFHDRRDAQDAEAAMDGAELDGRELRV 

QVAR YGRRDL PRSRQGRRHAAGPE AA/RYGRRSRS YGRRS RS PR 

RRHRSRSRGPSCSRSRSRSRYRGSRYSRSPYSRSPYSRSRYSRS 

PYSRSRYRESRYGGSHYSS3GYSNSRYSRYHSSRSHSKSGSSTS 

SRSASTSKSSSARRSKSSSVSRSRSRSRSSSMTRSPPRVSKRKS 

KSRSRSKRPPKS PEEEGQMSS 


6837 


1 


1369 


TDGAAVAGNPGSDYFPGGTAP/ GGPRTRRP \SGTSS SGS KA&GP 
PNP PAQGOGTSLS PNYTLES TSGNDGKPVSGGGGRGRGRRKRDS 
GHVSPGTFFDKYSAAPDSGGAPGVSPGQQOASGAAVGGSSAGET 
RGAPTPHEKALTS PS WGKGAELLLGDQPDL IG S LDGGAKS DSS S 
PNVGEFASDE VS TS YANE DE VSS S S DNPQALVKAS R S PLVTGSP 
KLPPRGVGAGEHGPKAPPPALGLGIMSNSTSTPDSYGGGGGPGH 
PGTPGLEQVRTPTSSSGAPPPDEIHPIiEILQAQIQLQRQQFSIS 
EDQPLGLKGGKKGECAVGASGAQNGDSELGSCCSEAVKSAMSTI 
C L DS LMAEHS AAWYM P ADKAL VDS ADDDKTLAP W EKAXPQlsTPNS 
KEAHD LP ANKAS ASQPG S HLQCLS VHCTDDVGDAKARAS VP TWR 
SLHSDISNRFGTFVAALT 


6838 


16 


499 


bTDTPPPKTHMIHHSISDYKATLRCWALGFYPMEITLTWQQDEE 
DQTRDMELVETRPAGDGTFQKWAAWVPS GEB /Q/RYMCH VQHB 
GLPEPLTLRWEQSSQPTI PI VGI VAGLVLLGAVVTGAVVSAVMC 
RKKNSDR VS YS EAASSDHAQGSD VS LTACKV 


6839 


1 


1195 


AAPAGGGPDPEALSAFPGRHLSGLS WPOVKRLDALLSFP TP t ucz 
RGNFPTLS VQPRQIRAGG PQHPGGAG \ IHVHR VRLHGSAASHVL 
HPESGLGYKDLDLVFRMDLRSEASFQCrKAWLACLLDFLPAGV 
SRAKITPLTLKEAYVQKLVKVCTDSDRWSLISLSNKSGKNVELK 
FVDSVRRQFEFS IDSFQ 1 1 LDSLLL FGQCS STPMS E AFHPTVTG 
ESLYGDFTBALEHLRHRVI ATRS PEE I RGGGLLKYCHLLVRGFR 
PR P S TD VRALQR YMCSR F F I DFPDL VEQRRTL ER YLEAH FGG AD 
AARR YACLVTLHRVVNESTVCLMNHERRQTLDL I AALALQALAE 
QGPAATAAXAWRPPGTDGVVPATVNYYVTPVQPLLAHAYPTWLP 
CN 


6840 


4254 


2061 


ELQGDFSVPDVPKSMAWCENS I CVGFKRDYYL I RVDGKGS I KEL 
FPTGKQLEPLVAPIJU^KVAVGQDDLTVVIiNEBGICTQKCALNW 
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Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








TDIPVAMEHQPPYIIAVLPRYVKIRTFEPRLLVQSIELQRPRFi~ 
TSGGSNIIYVASNHFVWRLIPVPMATQIQQLLQDKQFELALQLA 
EM KDDSD S 5 KQQQ IHH I KNLYAFNLFCQKRFDESMQVFAKLGTD 
PTHVMGL Y P DLLPTDYRKQLQYPNPLPVLSGAE LE KAHLALIDY 
LTQKRSQLVXKLNDSDHQSSTSPLMEGTPTIKSKKKLLQIIDTT 
LLKCYLHTNVALVAPLLRLBNNHCHIEESEHVLKKAHKYSELI I 
LYEKKGLHEKAL03^VIX3SKKANSPLKGHERTVQYIiQHLGTENL 
HLI FS YSVWVIiRDFPEDGLKI FTEDLPEVES LPRDRVLGFLIEN 
FKGLAIPYLEHIIHVWEETGSRFHNCLIQLYCEKVQGLMKEYLL 
SFPAGKTPVPAGEEEGELGEYRQKLLMFLEISSYYDPGRLICDF 
PFDGLI^ERALLLGRMGKHEQALFIYVHILKDTRMABEYCHKHY 
DRNKDGNKDVYIjSLLRMYLSPPS ihclgpi klelle pkanlqaa 

LQ VLELHHS KIiDTTKALNLLFANTQIND IR I FLEKVLEENAQKK 
RFNQVLKNLLHAEFLRV\QEERIIjHQQVKCI iteekvcmvckkk 
I GNSAFARYPNG WVHYFCS \ KEVNPADT 


6841 


1 


3206 


TPSTTCTKSNTPTSSVPSAAVTPLNEStQPtGDYdVGSKNSKRA ' 
REKRDSRNMEVQVTQEMRNVSIGMGSSDEWSDVQDIIDSTPELD 
MCPETRLDRTGSSPTQGIVNKAFGINTDSLYHELSTAGSEVIGD 
VDEGADLLGE FSGMGKE VGNLLLENSQLLETKNALNWKNDLI A 
KVDQLSGEQE VLRGELEAAKQAKVKLENRI KELEEELKRVKSEA 
I IARRE PKEEAEDVSS YLCTESDKI PMAQRRRFTRVEMAR VLME 
RNQ YKERLMELQEAVRWTEMI RAS REHPSVQE KKKS T I WQ F FS R 
LFSS SSS PP PAKRPYPSGNIHYKS PTTAGFSQRRNHAMCP I SAG 
SRPLEFFPDDDCTSSARREQKREQYRQVREHVRNDDGRLQACGW 
SLPAKYKQLSPNGGQEDTRMKNVPVPVYCRPLVEKDPTMKLWCA 
AGVNLSGWRPNEDDAGNGVKPAPGRDPLTCDREGDGEPKSAHTS 
PEKKKAKELPEMDATS SRVW I LTSTLTTSKWI IDANQPGTWD 
QFTVCNAHVLCISSiPAASDSDYPPGEMFIiDSDVNPEDPGADGV 
LAG I TLVGCATRCNVPRSNCSSRGDTPVLDKGQGEVATI ANGKV 
NPSQSTEEATEATEVPDPGPSEPETATLRPGPLTEHVFTDPAPT 
PSSG PQPGS ENGPE PDS S S TR PEPEP SGDPTGAGS SAAPTM WLG 
AQNGWL YVHS AVANWKKCLHS X KLKDS VLSLVHVKGRVLVALAD 
GTLAI FHRGEDGQWDLSNYHLMDLGHPHHSIRCMAWYDRVMCG 
YKNKVHVIQ PKTMQIEKS FDAHPRRESQ VRQLAW I GDGVWVS IR 
LDSTLRLYHAHTHQHLQDVDIEPYVSKMLGTGKLGFSFVRITAL 
LVAGSRLWVGTGNGWISIPLTETVVIJIRCKJ\LW5\IJ?ANKT5P 
TS GEG \ ARPGG \ I IHVYG\DDSSDRaaRSFIPYCSMAQAQLCFH 
GHRDAVKFFV S VPGNVLAT LNGS VLDS PAEGPGPAAPASEVEGQ 

KLRNVLVLSGGEGYIDFRIGDGEDDETEEGAGDMSQVKPVLSKA 
ERSHIIVWQVSYTPE 


6842
6843


3 


92* 


RCQQIjSATILTDHQ YLERTPLCAIIjKQKAPQQYR I RAKLRS YKP 
RRLFQSVKLHCPKCHLLQEVPHEGDLDI I FQDG AT KTPDVKLQN 
TSLYDSKIWTTKNQKGRKVAVHFVKNNGILPLSNECLLLIEGGT 
LS E I CKLSNKFNS V I P VRSGHEDLELLDLSAPFLI QGTVHHYGC 
KQWST * RS I QNLNSLVDKTS WI PSS VAE ALt3 1 vPT.nvrimnvt'Piw 
LDDGTGVLEAYLMDS DKFFQ I PASEVLMDDDLQKS VDMI MDMFC 
PPGI KI DAYPWLECFI KS YNVTNGTDNQIC YQ I FDTT VAEDVI 




2 


851 


NHRKVLSGAKRYECNKCGKS FAYTSSLI KMRU1HTGERPYBCSE 
CGRS FAENS SLIKHLRVHTGERP YE CVE CGKSFRRS SSLLQHQR 
VHTRERPYECSECGKSFSLRSNLIHHQRVHTGERHECGQCGKSF 
SRKSSLI IHIiRVHTGERPYECSDCGKSFAENSSI, IKHLR VHTCE 
RP YECIDCGKS FRHS SS FRRHQRVHTGMRPYK* S KFWKFS CPGF 
LLLQGQR VHTGS RCYECDKWG I FFS *NAS FFT* KSAPTEEVPFE 
CNE CEKA FSPIiSLVTTI FT 


6844


244 


642 


EHQLAGFELRKTQTSMSLGTTREKTDRVKSTAYLSPQELEDVFY 
QYDVKSEIYSFGIVLWEIATGDIPFQGCNSEKIRKLVAVKRQQE 
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Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








PLGEDCPSEIiREIIDBCRAHDPSVRPSVDEILKKLSTFSK*CIK ' 
I 


6845 


3 


1519 


VAVR DE C Y WRHVFWDQDLWMLLF I LMCHP ETARARLS YrI RT&D 
GALENAQNLGYQGAKFAWESADSGLEVCPED I YGVQE VHVNGAV 
GLAFELYYHTTQDLQLFREAGGWDWRAVAE FWCSRVEWSPRBE 
KYHLRG VMS PDE YHSG VNNS VYlWVIiVQNSLRFAAALAQDLGLP 
I PS QWIiAVADKI KVP FDVEQNFHPEFDGYEPGEWKQADWLLG 
Y P VP FSLS PDVRRKNLEI YEAVTS PQGPAMTWS MFAVGWMELKD 
AVRARGIjLDRSFANMAEPFKVWTEMADGSGAVNFLTGMGG FLQA 
WFGCTGFRVTRAGVTFDPVCLSG I SR VS VSGI FYQGNKLNFS F 
SEDSVTVEVTARAGPWAPHLEAELWPSQSRLSLLPGHKVSFPRS 
AGRIQMSPPKLPGSSSS EFPGRTFSDVRDPLQS Pr,WVTLGSSSP 
TES LTVDPAS E * SGTGASETSLGPS LWPRLHP PLIiGTLLACHPS 
PAARLSGKVHAAWPEFKAFCL 


6846 


213 


1258 


LYFIiKTIK*LNRLAEHP*YENEKLTKLRNTIMEQYTRTEESARG 
1 1 FTKTRQSAYALSQWITENEKFAEVGVKAHHLIGAGHSSEFKP 
MTQNEQKBVI S KFRTGK I NLLIATTVAEEGLD I KECNXVTR YGL 
VTNEIAMVQARGRARADESTYVLVAHSGSGVIEHETVNDFREKM 
MYKAIHCVQNMKPEEYAHKILBLQMQSIMEKKMKTKRNIAKHYK 
mPSLITFLClQICSVIiACSGEDIHVIEKMHHVNMTPEFKELYIV 
RENKTLQKKCADYQINGEI ICKCGQAWGTMMVHKGLDLPCLKIR 
NFVWFKNNSTKKQYKKWVEIiP ITFPNLDYSECCLFSDED 


6847


1450 


348 


SMCWNSDRLEMPLlDLALILYPPSYVPYTGHLSDDSLSRKYCLf * 
WFEDALNG VL* RAE AIQPHCVNAGDRMEKFRQKYWNKLQTLRQQ 
PFAYGTLTVRSLLDTREHCLNEFNFPDPYSKVKQRENGVALRCF 
PGWRS LDALGWEERQIiALVKGLLAGNVFDWGAKAVSAVLESDP 
YFGFEEAKRKLQERPWLVDSYSEWIiQRLKGPPHKCALlFADNSG 
IDI ILGVFPFVRELLLRGTEVIIACNSGPAIjNDVTHSESLIVAE 
R I AGMD PWHS ALREERLIiljVQTGSS S PCLDLS RLDKGLAALVR 
ERGADLWI EGMGRAVHTNYHAALRCESLKLAVI KNAWLAERLG 
GRLFS VI FKYEVPAE 


6848 


19 


16 


AMWWNSLDGIRNIVLSNPKKRNTLSLAMLKSLQSDlLHDADSND " 
LKVI 1 1 SAEGPVFSSGHDLKELTBEQGRDYHAEVFQTCSKVMMH 
IRNHP VPV3 AMVNGIiATAAGCQLVAS CD IAVASDKS S FAT PGVN 
VGLFCSTPGVALARAVPRKVALEMLFTGEPISAQEALLHGLLNK 
WPEABLQEETMRIARKIASLSRPWSLGKATFYKQLPQDLGTA 
YYLTSQAMVDNIiALRDGQEG I TAFLQKRKPVWSHEP V* VEH 


6849 


70 


821 


SLGVDGSCLEQGS PAPRPQTDTSP * P VGNWATGQEDLYtiQS YE - C 
VCVLFASVPDFKEFYSESNI NHEGLECLRLLNE I IADFDELLSK 
PKFSGVEKIKT IGS TYMAATGLNATSGQDAQQDAERSCSHLGTM 
VEFAVALGSKLDVINKHSFNNFRLRVGLNHGPVVAGVIGAQKPQ 
YD I WGNTVNVAS RMESTG VLGKIQVTEETAWALQSLGYTC YSRG 
VI KVKGKGQLCT YFLNTDLTRTGPPS ATLG 


6850 


2 


1235 


ARGLNHEWTFEKLRQHISRNAQDKQELHLFMLSGVPDAVFDLTD 
JjDVLKI,EIjIFEAKIPAKISQMTNI^ 

RDHLRCLHVKFTDVAEIPAWVYLLKNLRELYLIGNLNSENNKMI 
GLaESLRELRHLKILHVKSNLTKVPSNITDVAPHLTKLVlHNDGT 
KLLVLKSLKKMMNVAELELQNCELERI PHA1FSLSNLQELDLKS 
NNIRT1EEIISFQHLKRLTCLKLWHNKIVTIPPSITHVKNLESL 
Y FSNNKLESLP VAVFSIjQKLRCLDVS YNNI SMI P I E I GLLQNLQ 
HLHITGNKVDILPKQLFKCIKLRTLNLGQMCITSLPEKVGQLSQ 
LTQLELKGNCLDRLPAQUGQCRMUCKSGLWEDHLFDTLPLEVX 
EALNQDINIPFANGI 


6851 


17*5 


660 


VSAQVSAREGENCLGWNLADSSQESYKSLEEAEDCYPPSLLTLD 
LRDLFNQVEQGPLLSCPKAGTDLSMGRAREVGWMAAGLMIGAGA 
CYCVYKLTIGRDDSEKLEEEGEEEWDDDQELDEEEPDIWFDFET 
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Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








MARPWTEDGDWTEPGAPGGTEDRPSGGGKANRAHPIKQRPFPYE " 
HKNTWSAQNCKNGSCVLDLSKCLPIQGKLLFAEPKDAGFPFSQD 
INSHLAS LS MARNTS PTPDPTVREAL CAPDNLNAS I KSQGQ I KM 
YINE VCR ET VSRCCNS FIXJQAGLNLLISMTV I NNMLAKS ASDLK 
F PLI SEGSGCAKVQVLKPLMGL SEKP VLAGEl»VGAQML FS FMS L 
F I RNGNRE I LLETPAP 


6852


1 


407 


RTRG EET YANFIKHNDGKN I FYAART PATLFAVM FAM Y 1 1 SGLT ' 
GFIGLNSIAVLC3JLVMGLALIFLCTWAYVKYSGEFRBIGTVIDQ 
IAETLWEQVLKPLGDNLMEENIRQSVTNSIKAGLTDQVSHHARL 
KTD 


6853 


3 


469 


GDS CAVCIEliYKPNDL VRILTCWli i FlU^YcVDP^l^EHRTCPMC 
KCDILKALGIEVDVEDGSVSLQVPVSNEIFNSASSHEEDNRSET 
AS SG YAS VQG TYEPPLEEHVQS TNE SLQLVNHEANS VA VD V I PH 
VDNPTFE EDETPNQETAVRE I KS 


6854 


1148 


585 


HESYIGTFDPGELCVCAAIQWIiQDNSASYFLNRKLVYEPSTQAK 
PVKNTFLRMWI YSHHI YQQDLRKKI LDVGKRLDVTGFCMTGKPG 
IICVEGFKEHCEEFWHTIRYPNWKHISCKHAESVETEGNGEDLR 
LFHS FE ELLLEAHG DYG LRND YKMNLGQFLE FLKKH KS E H V FQ I 
LFGIESKSSDS 


6855 


1913 


1148 


GRVGGRVGRlCSPLSGANEYIASTDTIjkTEEVLIiFTDQTDDLAK 
EEPTSLFQRDSETKGESGLVIiEGDKEIHQlFEDLDKKLAIiASRF 
Y I PEGCIQRWAAEMWAtPALHREG I VCRDLNPNNILLNDRGHI 
QLTYFSRWSEVEDSCDSDAI ERMYCAPEVGAI TEETEACDWWSL 
GAVLFELLTGKTLVECHPAGINTHTTLNMPEWVSEEARSLIQQL 
LQFNPLERLGAGVAGVEDIKSHPFFTPVDWAELMR 


6856


1617 


• $97 


VTQLYVSVDASTKDSLKKIDRPLFKDFWQQFLDSLKALAVKQQR ' 
TVYRLTL VKA WNVDELQA YAQLVSLGNPDFI EVKGVTYCGES SA 
SSLTMAHVPWHEEWQFVRELVDLIPEYEIACEHEHSNCLLIAH 
RKFKIGGEWWTWINYNRFQELIQEYEDSGGSKTFSAKDYMARTP 
HWALFGASERGFDPKDTRHQRKNKSKAISGC 


6857 


1 1 


617 


KGPEATAMVCV^kPNCRQKrHIKPSkSAAQTMCGSPTPASAPNH 
KLMAMEQGKTLPS ATEDAKEEGLEAQ ISRLABL I GRLESKALWF 
DLQQRLSDEDGTNMHLQLVRQEMAVCPEQLSEFLDSLRQYLRGT 
TGVRNCFHITAVRLSDGFTFVIYEFWETEEAWKRHLQSPLCXAF 
RHVKVDTLSQPEALSRILVPAAWCTVGRD 


6858 


2 


669 


RSRGIKDFENDPPLSSCGIFQ3RIAGDALLDSGIRISSVFASPA " 
LRCVQTAKLILEELKLEKKIKIRVEPGIFEWTKWEAGKTTPTLM 
SLEEIiKEANFNIDTDYRPAFPLSALMPAESYQEYMDRCTASMVQ 
xvaL wQI/roviijIVSHGSTIdJSCTRP 

KIPSLGMCFCEENKEEGKWELVNPPVKTLTHGANAAFNWRNWIS 
GN 


6859 


1 


1150 


\jninc wuuvi AAiuu^KiucauiiSijUXWJjSDI IQSPSSTGLLKSG 
KTNS VES LPELLTSDSEGS YAGVGS PRDLQS PDFTTGFHSD KI E 
AKVKPYVNGTSP VYSREDLKPWEKS P I LKISAPQPIPSNRI DTT 
SSAS WVAGS FSP VS PPWDLRTIME I EESRQKCGATPKSHLGKT 
VSHGVKLSQKQRKM lALTTKENNSGMNSMETVLFTPS KAPKP VN 
AW ASS LRS VSSKS FRDFLLEEKKSVTSHSSGDHVKKVSFKG I EN 
SQAPKIVRCSTHGTPGPEGNHISDLPLLDSPNPWLSSSVTAPSM 
VAPVTFASIVEEELQQEAALIRSREKPLALIQIEEHAIQDLLVF 
YBAFGNPEE FVI VERTPQGPLAVPMWNKHGC 


6860 


1889 


1515 


DKDkKRQKKRGI FPkVATNIMRAWLFQHL'THP YPS EEQKKQCAQ 
DTGLTILQVNNWFINARRIIVQPMIDQSNRAVSQGAAYSPEGQP 
MGSFVLDGQQHMG1RPAGPMSGMGMNMGMDGQWHYM 


6861


1889 


1515 


DKDKKRQKKRGIFPKVATNIMRAWLFQHLTHPYPSEEQKKQIiAQ 
DTGLTILQVNNWFINARRIIVQPMIDQSNRAVSQGAAYSPEGQP 
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Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








MGSFVLDGO^HMGIRPAGPMSGMGMNMGMDGQWHYM 


6862 


2 


471 


EEIDREFHNKLKLKEDKLEKQEKPVNGEDKGDSGVDTQNSEGNA™ 
DEEDPIiGPNCTrYDKTKSFFmisa?DNRBRRPTWAEERRLNAET 
FGI PLRPNRGRGGYRGRGGLGFRGGRGRGGGRGGTFTAPRGFRG 
GFRGGRGGREFADPEYRKTTAFGP 


6863 


2216 


487 


PQE PALKSEFSQVASNT I P LPL PQ PNTCKDNGPCKQVCSTVGGS 
AICSCFPGYAIMADGVSCEDQDECLMGAHDCSRRQFCVNTDGSF 
YC VNHTVLCADGY I LNAHRKCVDINECVTDLHTCSRGEHCVNTL 
GSFHCYKALTCEPGYALKDGECEDVDECAMGTHTCQPGFLCQNT 
KGS FYCQARQRCMDGFLQDPEGNCVDINKCTS LSEPCRPGFSCI 
NTVGS YTCQRNP L I CARG YHASDDGTKCVDVN E CETGVHRCGEG 
QVCHNLPGSYRCDCKAGFQRDAFGRGCIDVNECWASPGRLCQHT 
CENTLGSYRCSCASGFLLAADGKRCEDVNECEAQRCSQECANIY 
GS YQCYCRQG YQLAEDGHTCTDI DE CAQGAG I LCTFRCLNVPGS 
YQCACPEQGYTMTANGRSCKDVDEOUiGTHNCSEAETCHNIQGS 
FR CLRFECP PNYVQVS KTKCBRTTCHDFLECQNS PAR I THYQLN 
FQTGLLVPAHIFRIGPAPAFTGDTIALNIIKGNEEGYFGTRRLN 
AYTGWYLQRAVLEPRD FALDVEMKLWRQGS VTTFLAKMHI FFT 
TFAL 


6864 


2 


2933 


LKDSS PSNLQI IIKELLSMHHQPDPALTKBFDYLPP VDSRSSSG 
FVGLRNGGATCYMNAVFQQLYMQPGLPESLLSVDDDTDNPDDSV 
FYQVQS LFGHLMES KLQYYVPENFWK1 FKMWNKEliYVREQQDAY 
EFFTSLIDQMDEYLKKMGRDQIFKNTFQGIYSDQKICKDCPHRY 
EREEAFMALNLGVTSCQSLEISLDQFVRGEVIiEGSNAYYCEKCK 
EKRITVKRTCIKSLPSVLVIHLMRFGFDWBSGRSIKYDEQIRFP 
WMLNMEPYTVSGMARQDSSSEVGENGRSVDQGGGGSPRKKVAliT 
BNYELVGVIVHSGQAHAGHYYSFIKDRRGCGKGKWYKFNDTVIE 
EFDLNDETLEYECFGGEYRPKVYDQTNPYTDVRRRYWNAYMLFY 
QRVSDQNSPVLPKKSRVSWRQEAEDLSLSAPSSPEISPQSSPR 
PHRPNNDRLS ILTKLVKKGEKKGLPVEKMPARIYQMVRDENLKF 
MKNRD\TYSSDYFSFVLSIiA5LNATKLKHPYYPCMAKVSLQLAIQ 
FLFQT YLRTKKKliRVDTEEW I ATTEALLS KS FDACQWLVE YFI S 
SEGRELIKIFLLECNVREVRVAVATILEKTLDSALFYQDKLKSL 
HQLLEVIiLALLDKDVPENCKNCAQYFFLFNTFVQKQGIRAGDLL 
LRHSALRHMI S FLLGASRQNNQ IRRWS S AQAREFGNLHNTVALL 
VLHSDVSSQRNVAPG I FKQRPP IS IAPSS PLLPLKEEVEALLFM 
S EGKPYLLEVMFALRELTGSLLALIEMVVYCCFCNEHFSFTMLH 
FIKNQLETAPPHELKMTFQIiLHEIIiVIEDPIQVERVKFVFBTEN 
GLLALM KHSNHVDS SRCYQCVKFLVTLAQKCPAAKE YFKENSHH 
WS WAVQ WLQKKMSEHYWTLQSNVSNETS TGKTFQRT ISAQDTLA 
YATALLNBKEQSGSSNGSESSPANENGDRHLQQGSESPMMIGEL 
RSDLDDVDP 


6865 


1820 


1242 


DPERWKHLSKVTPP66SVSTTPVQWRLQSPQSQGSMMPSCNRS 
CSCSRGPSVEDGKWYGVRSYLHLFYEGYAVPPKLEGIGEGEFLV 
LDQRAAD YNQALGTCRLAGTALCVAAGVLLAI CLFWAM IGWLS Q 
DTKAEPLDPEADSHVEVFGDBPEQQLSPIFRNASGQSWFSPPAS 
PPGQSSVQTIQPKRDS 


6866 


1571 


495 



D CPRPR YTLYGLRATCMRDi»DWAW INAVSAFKAiEQDLP VN I KF 
I X EGMEEAGS VALE ELVEKEKDRFFSGVDYI VI SDNLW I SQRKP 
AITYGTRGNSYFMVEVKCRDQDFHSGTFGGILHEPMADLVALLG 
5 LVDSSGHZ LVPGI YDEWPLTEEEINTYKAIHLDLEE YRNSSR 
VEKFLFDTKEEILMHLWRYPSLS IHGIEGAFDEPGTKTVIPGRV 
IGKFS I RLVPHMNVS AVEKQVTRHLEOVFSXRNS SNKMWSMTL 
GLHPWIANIDDTQYIiAAKRAIRTVFGTEPDH IRDGSTI P IAKMF 
QEIVHKSWLIPLGAVDDGEHSQNEKINRWNYIEGTKLFAAFFL 
EMAQLH 
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residue of 
amino acid 
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Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6867


2833 


1704 


QTRIMSQPKQKELAGFVRQKMLLDYSVYMGRCVPQESRSPQRSP " 
LQS AE S S FTAG KKLP EV P P S EE EE Q EAWVNALLGR I FWDFLGEK 
YWSDLVSKKIOMKLSKJKLPYFMNELTLTBI^DMGVAVPKILOAF 
KPYVDHQGLWIDLE^4SYNGSFLMTLETKMNLTKLGKEPLVEALK 
VGB I GKEGCRPRAFCLADS DEES S SAGS SEED DAPEPSGGDKQL 
LPGAEGYVGGHRTSKIMRFVDKITKSKYFQKATETEPIKKKIBE \ 
VSNTP LLLTVEVQ ECRGTLAVNI P P PPTDRVW YGFRKPPHVELK 
ARPKLGERETVTLVHVTDWIEKXLEQEFQKVFVMPNMDDVYITIM 
HSAMDPRSTS CLLKDPPVEAADQP 


6868 


1 


346 


RPTR PPTRPEB IKNLILP YI SDMNFVQDLCEDFYEtFKTDiCGFD 
KATFESQMSVMRGQILNLTQALRDGKSPFQLVQIPCVIVERSQG 
GSQGRIVHLSNSFTQTVNCRKPFFSSW 


6869


3 


1619 


MYMERMDKRALISFWESVEHLKNANKNEIPQLVGEIYQNFFVES 
KEISVEKSLYKEIQQCLVGNKGIEVFYKIOEDVYETLKDRYYPS 
FIVSDLYEKIjLIKEEEKHASQMISNKDEMGPRDEAGEEAVDDGT 
NQINEQAS FAVNKLRE LNE KLE YKRQALNS IQNAPKPDKKI VS K 
LKDEI ILIEKERTDLQLHMARTDWWCENLGMWKASITSGEVTEE 
NGEQLPCYFVMVSLQEVGGVETKNWTVPKRLSEFHNIjHRKLSEC 
VPSLKKDQLPSLSKLPFKS I DHTFMEKFEKQLNKFLQNLJjS DER 

LCQS EALYAFL s ps pdylkvidvqgkkns fslss flerlprdff 

SHQEEETEEDSDLSDYGDDVDGRKDALAEPCFMIjIGEI felrgm . 

fkwvrrtlialvqvtfgrt inkqirdtvswi fseqmlvyyini f 
rdafwpngklappttirskeqsqetkqraqqkllenipdmlqsl 
vgqqnarhgi iki fnalqetrankhllyalmelllielcpelrv 

HJjDQLKAGQV 


6870 


1 


1566 


MAAWAATRWWQLIAVLSAAGMGASGAPQPPNILLLLMDDMGWG 
DtiGVYGEPSRETPNIiDRMAASGLLFPNFYSANPLCSPSRAAIiLT 

grlpirngfyttnaharnaytpqeivggipdsbqllpellkkag 
yvskivgkwhlghrpqfhplkhgfdewfgspnchfgpydnkarp 
nipvyrdwemvgryyeefpinlktgeanltqiylqealdfikrq 
arhhpfflywavdathapvyaskpflgtsqrgrygdavrb 1 DOS 
I GKILELLQDLHVADNTFVFFTSDNGAAIjI s apeqggsngp flc 

gkqttfeggmrepalawwpghvtagqvshqlgs imdlfttslal 
agltppsdrai dgijmllptllqgrlmdrp I FY yrgdtiwaatlg 
qhkahfwtwtnsmenfrqgidfcpgqnvsgvtthnledhtklpl 

I PHIX3RDPGERFPLSFASAE YQEALSRITSWQQHQEALVPAQP 
QLRVCNWAVMNWAPPGCEKLGKCLTPPESIPKKCLWSH j 


6871 


209 


112* 


RMSLNPPIFLKRSBF^SSKFA^TKQSQTTSIASBDPLONtctAS 
QEVLQKAQQSGRSKCLKCGGSRMFYCYTCYVPVENVpiEQIPLV 
KLPLKIDI IKHPNBTDGKSTAIHAKLLAPEFVNI YTYPCI PE YE 
EKDHEVALIFPGPQSISIKDISFHLQKRIQNNVRGKNDDPDKPS 
FKRKRTEEQEFCDLNDSKCKGTTLKKIIFIDSTWNQTNKIFTDE 
RLQGLLQVELKTRKTCFWRHQKGKPDTFLSTIEAIYYFLVDYHT 
D I LKEKYRGQYDNLLFF YS FMYQLI KNAKCSGDKBTGKLTH 


6872 


880 


459


KKLITEyFVRQKYIiEYRRIPYTEPAEYEFLWGPRAFliETSKMLV 
LRFLAKLHKKDPQSWPFHYLEALAECEWEDTDEDEPDTGDSAHG 
PTSRPPPR 


6873 


1929 


955 


DEQAVLCSKDKTYDLKIADTSNMLLFIPGCKTPDQLKKEDSHCIT 
IIHTEIFGFSNNYWELRRRRPKLKKLKKLIiMENPYEGPDSQKBK 
DSNSSKYTTEDLLDQIQASEEEIMTQLQVLNACKIGGYWRILEF 
DYEMKLLNHVTQLVDSES WS FGKVPLNTCLQELGPLEPEEMIEH 
CLKCYGKKYVDEGEVYFELDADKICRAAARMLLQNAVKFNIAEF 
QEVWQQSVPEGMVTSLDQLKGLALVDRHSRPEIIFLLKVDDLPB 
DNQBRFNSLFSLREKWTEEDIAPYIQDLCGBKQTIGALLTKYSH 
S S MQNG VKVYNSRRP I S 
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y 4UU J. iiy 

nucleotide 
location 
corresponding 
to first 
amino acid 
residue of
amino acid 
sequence 


Predicted end
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6874 


1 


307 


DS IADHVNSAAVNVEEGTKNLGKAAKYKLAALPVAGALIGGMVG 
GPIGLLAGFKVAGIAAALGGGVI^FTGGKLIQRKKQKMMEKLTS 
S CPDLPSQTDKKCS 


6875 


1688 


349 


VIGTGERGNS AS E KWB I MFNEELGDP FI 1 1 HS I S LLNAE BHS IA 
TLLLRIEKBELDMKGSGFYVSLEWVTISKKNQDNKKYBIIKRDI 
LRG KS VPHYAAI EPDGNGLM I VS YKS LTFVQAGQDLBBNMDED I 
SEKIKEPLYYWQQTEDDLTVTIRLPEDNTKEDIQIQFLPDHINI 
VLKDHQFLEGKLYSSIDHESSTWIIKESNSLEISLIKKNEGLTW 
PELVIGDKQGELIRDSAQCAAIAERLMHLTSEEIjNPNPDKEKPP 
CNAQELEECDI FFEESS SLCRFDGNTUCTTHWNLGSNQYLFSV 
IVDPKEMPCFCLRHDVDALLWQPHSSKQDDMWEHIATFNALGYV 
QAS KRDKKF FACAPN YS YAALCB CLRRVP I YRQPAPMSTVL YNR 
KEGRQVGQVAKQQVASLBTNDPI LGFQATNBRLFVLTTKNXiFLI 
KVNTEN 


6876 


41 


1285 


VGEMTLIWRHLLRPLCLVTSAPRILEMHPFLSLGTSRTSVflOGS^" 

LHTKPRMPPCDFMPERYQVIFLVNSGSEANELAMLMARAHSNNI 

DI IS FRGAYHGCSPYTLGLTNVGI YKMELPGGTGCQPTMCPDVF 

RGPWGGSHCRDSPVQTIRKCSCAPDCCQAKDQYIEQFKDTLSTS 

VAKS IAGFFAEPIQGVNG WQYPKGFLKEAFELVRARGGVCIAN 

E VQTGFGRLGSHFWGFQTHDVLPDI VTMAKG I GNGFPMAAVITT 

PEIAKSLAKCIiQHFNTFGGNPMACAIGSAVLEViKEENLQENSQ 

EVGTYMLLKFAKIjRDEFEIVGDVRGKGLMIGIEMVQDKISCRPL 

PRBEVNQIHEDCKHMGLLVGRGSIFSQTFRIAPSMCITKPEVDF 

AVEVFRSALTQHMERRAK 


6877


1 


778 


GTSPSPARAYAPPTERKRFYQNVSITQGEGGFEINLDHRKLKTP 
QAXLFTVPSEALAIAVATEWDSQQDTI KYYTMHLTTLCNTSLDN 
PTQRNKDQL I RAAVKFLDTDTI CYRVEEPETLVELQRNEWDP I 1 
EWAE KRYGVE 1 3 SSTS IMGPS I PAKTREVLVSHLAS VNTWALQG 
IEFVAAQLKSMVLTLGL IDLRLTVEQAVLLSRLEEE YQI QKWGN 
IEWAHDYELQELRARTAAGTLFIHLCSESTTVKHKLIiKE 


6878


931 


263 


QT^DFK^RA^MiDFNlRIKNVTRSDAGKYRCEVSAPS^QGQN 
LEED TVTLEVL VAPAVP S CEVPS S AL S GTWELRCQ DKEGN PAP 
EYTWFKDGIRLLENPRLGSQSTNSSYTMNTKTGTLQFNTVSKLD 
TGE YS CEARNS VGYRRCPGKRMQVDDLNISG 1 1 AAWWALVI S 
VCGLGVCYAQRKGYFSKETS FQKSNS S S KATTMS ENDFKHTKS F 
II 


6879 


3 



845 


IRVIGESDIMQEFLSESDENYNGVSDVELRVALPDGTTVTVRVK 
KNSTTDQVYQAI AAKVGMDSTTVNYFALFEVI SHS FVRKLAPNE 
FPHKLY I QNYTSAVPGTCLT IRKWLFTTEBE I LLNDNDLAVTYF 
FHQAVDDVKKGYIKAEEKSYQLQKLYEQRKMVMYLNMLRTCEGY 
NE 1 1 FPHCACDS RRKGHVITAIS I THFKLHACTEEGQLENQVIA 
FEWDEMQRWDTDEEGMAFCFEYARGEKKPRWVKI FTPYFNYMHE 
CFERVFCELKWRKEEY 


6880


2110 


1437 


RKDNCTAKEWTFPEAKWNTTARVFSHIRLGMGHVLIIVQCFISS 
MANI YNEKILKEGNQLTES I F IQNSKLYFFG I LFNGLTLGLQRS 
NRDQIKNCGFFYGHRAFSVALIFVTAFQGLSVAFILKFLDNMFH 
VU4AQVTTVI ITTVSVLVFDFRPSLEFFLEAPSVLIiSIFlYNAS 
KPQVP E YAPRQER I RDLSGNLWERSS GDGEELERLTKPKSDE S D 
EDTF 


6881 


2638 


2244 


NDSKWEDIHVI'i-GALKMFFRfeLPEPLFTFNHFNDFVNAiKQEPR 
QRVAAVKDLI RQLPKPNQDTMQI L FRHLRRV I ENGEKNRMT YQS 
IAIVFGPTIiLKPEKETGNIAVHTVYQNQIVELILLELSSIFGR 


6882 


1 


850 


GIPEAQLWIYPVKSCKGVPVSEAECTAMGLRSGNLRDRFWLVIN 
QEGNM VTARQBPRLVLISLTCDGDTLTLSAAYTKDLLLP I KTPT 
TNAVH KCR VHG LE I EGR DCG EATAQW ITS FLKS Q P YRLVH FE PH 
MRPRRPHQ I ADLFRP KDQ IAYSDTS P FLI LS EAS LADLNS RL E K 
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SEQ 
ID 

NO:


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








KVKATNPRPNIVISGCDVyAEDSWDELLIGDVELKRVMACSRCI 
LTTVDPDTGVMSRKEPLETLKS YRQCDPSERKLYGKS PLFGQYF 
VLENPGTI KVGDPVYLLGQ 


6883 


2794 


2256 


NSKLKLNQNLKLPITLTYQVLSLHGWGPGIHLQKEGAFPVTQNR 
ALQ LLYDLR YLNI VLTAKGDEVKS GR£ KPBS R I EKVTDHL EAL I 
D PPDLDVPTPHLNSNLHRl*VQRTSVIiFGLVTG TENQLAP RS STF 
NSQEPHNILPLASSQIRFGIjLPLSMTSTRKAXSTRNIBTKAQYD 
ANC 


6884 


2 


99 


BFERVTAEAVKPRETSEPRAAAQRFCEKFPFL 


6885 


297 


1554 


stgqfwhvtdlhldptyhitddhtkvcasskganasnpgpfgdv 
lcdspyqlilsafdfiknsgqeasfmiwtgdspphvpvpelstd 
tvinvitnmtttiqslfpnlqvfpalgnhdywpqdqlswtskv 

YNAVANLWKPWLDBEAISTLRKGGFYSQKVTTOT'NLRIISIiOT™ 

lyygpnimtlnktdpanqfewlestlnnsqqnkekvyi iahvpv 
gylpssqnitamreyyneklidifqkysdviagqfyghthrdsi 
mvlsdkkgspwslfvapavtpvk5vlekqtnnpgirlfqydpr 
DYKLLDMLQYYLNLTEANLKGBS I wkle yiltqtydi edlqpbs 
lyglakqftildskqfikyynyffvsydssvtcdktckafqica 
imnldnisyadclkqlyikhny 


6886 


2 


1341 


qcggipgreggssrpleegtgsspacvrgaapgsedafyptrak 
qarvsqeijckaajcrtvsisegpdtlgdgmrerretlalapepkp 
i^keacekwkrpfrsasatsltlshcvdvvkglldfkkrrghsi 
ggapeqryqi 1 pvcvaarlptraqdvldahlse vnavr fgpnss 
llatggadpjjihlwnwgsrleanqtlegaggsitsvdpdpsgy 
q viaat ynqaaqlwkvgeaqs ketlsgh kdkvtaakfkltrhqa 
vtgsrdrtvkew0lgraycsrtinvls ycndwcgdhii isghn 
dqki rfwd srg phctqvi pvqgrvtslslshdqlhllscsrdnt 

LKVIDLRVSNIRQVFRADGFKCGSDWTKAVFSPDRSYALAGSCD 
GALY1WDVI^KLESRIjQGPHCAAVNAVAWCYSGSHN3VSVDQGR 
KWLWQ 


6887 


1047 


116 


WTARPSQKPFWEAGAVPGDPLSTGCSQAQLG(SGCPRGPWGPQHG 

gqqraagptlprgerggpqqsgpgiaaqtpptskqyawrafltg 
tyrsqs prspagpfrggtgwwpepavclcvavgpqrls spglvy 
nasgsehcydiyrlyhscadptgcgtgpdarawdyqacteinlt 
fasnnvtdmfpdlpftdelrqrycldtwgvwprpdwlltsfwgg 
dlraas n i 1 fsnqnld p waggg i rrnls as v i avti qggahhxid 
lrash p e dpaswearkleat i igewvkaarreqqpalrggprl 

SIi 


6888 


1 


992 


F VAYVKKB t P^Wtt WTHCLLNPHAlLVI K^LP^k^PJiAliFT VVRVI 
KFIKGRAPNHRLFQAFFEEIGIEYSVLLFHTEMRWIiSRGQILTH 
I FEMYEE INQFLHHKSSNLVDGFENKEFKIHIiAYLADLFKHLNE 
LS AS MQRTGMNTVSARBKLS AFVRKFP FWQKR IE KRN FTNFPFL 
EEIIVSDNEGIFIAAEITLHLQQLSNFFHGYFSIGDLNEASKWI 
LDPFLFNIDFVDDSYLMKNDLAELRASGQILP1EFETMKLEDFWC 
AQFTAFPNLAKTALE I LMP FATT YLCELGFS ITFTPQNKVPEAA 
LILSODXRVAISKKVPSFLGHH 


6889 


1 


1534 


LTLENQ I KE ERKQDNS BS PNGRTS PLVSQNNEQGSTLRDLLTTT 
AGKLRVGSTDAGIAFAPVYSMGAPSSKSGRTMPNILDD I IASW 
BNKIPPSKTSKINVKPELKEEPEESIISAVDENNKLYSDIPHSW 
ICEKHILWLKDYKNSSNWKLFKECWKQGQPAWSGVHKKMNISL 
WKAE S I S LD FGDHQ ADLLNCKD S IIS NANVKE FWDG F E BVS KRQ 
KNKSGETWLKLKDWPSGEDFKTMMPARYEDLLKSLPLPEYCNP 
EGKFNLAS HL PG FFVRP DLGPRLC S AYGWAAKDHD I GTTNLH I 
EVSDWN ILVYVG IAKGNG ILS KAG I LKKFE E EDLDD I LRKRLK 
DSSEIPGALWHIYAGKDVDKIRBFLQKI3KBQGLEVLPEHDPIR 
DQS WYVNK1CLRQRLLBEYGVRTWTLIQFLGDAIVLPAGALHQVQ 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








NFHSCIQVTEDFVSPEHLVESFHLTQELRLIjKBEINYDDKLQVK 
NILYHAVKEMVRAX.KIHEDEVDDMEEN 


6890 


3 


667 


THACGMW I PLYLHRALWHKTAETCNS PPCGAKDSLI PGAIT^F 
TGFLGVDTGAGATRWCRLKTQRADPLVCAVGMLGSAI FICLI FV 
AAKS S I VGAY I CIFVGETIiLFSNWAITAD I LM YWI PTRRATAV 
ALQS FTSHLLGDAGS PYLIGPI SDLIRQSTKDSPLWEPLSLGYA 
LMLCPFVWLGGMFFLATALFFVSDRARAEQQVNQLAMPPASVK 
V 


6891 


1980 


1262 


lrihqellskelkllrgitiesiihiglaagkeqfmOdasnvmO " 
lllktqshlynmednnp evrqaaa yglgvmaqpggdd yrslcs e 

AVPLLVKVI KRAHSKTKXNVlATENClSAIGKIIiKPKPNCVNVD 
EVLPHWLSWLPLHEDKEEAIQTLSFLCDLIESNHPW1GPNNSN 
LPKI I SI IAEGKINETINYEDPCAKRLANVVRQVQTSEDLWLEC 
VSQLDDEQQEALQELLNFA 


6892


3 


876 


RSVAAASGPGAWGTDHYCLELLRKRDYEGYLCSLLLPAESRSSV 
FALRAFNVELAQVKDSVSEKTIGLMRMQFWKKTVEDIYCDNPPH 
QPVAIELWKAVKRHNLTKRWLMKI VDEREKNLDDKAYRNI KELE 
KYAENTQSS LLYLTLE I LG I KDLHADHAASH I GKAQG IVTCLRA 
TPYHGS RRKVFLPMDI CMLHGVSQEDFLRRNQDKNVRDV I YD I A 
SQAHLHLKHARSFHKTVPVKAFPAFIjQTVSLEDFLKKIQRVDPD 
IFHPSLQQKNTLLPI.YLYIQSWRKrY 


6893


1 


842 


DGERKSMSVERTFSEINKAEEQYSLCQELCSEIjAQDLQKKRLKG 

rtvt i klknvnfevktrast vssvvs taeei faiakellkteid 
adfphplrlrlmgvrissfpneedrkhqqrsiigflqagnoals 

ATECTLEKTDKDKFVKPLEMSHKKS FFDKKRSER KWSHQDTFKC 
EAVNKQSFQTSQPFQVLKKKMNENLE ISBNSDDOQILTCPVCFR 
AQGCIS LEALNKHVDECLDGPS ISENFXMFSCSHVSATKVNKKE 
NVP ASS LCEKQD YEAH 


6894 


1742 


1463 


TTLCKPLVPREHQFYETLPAEMRKFTPQYKGKSQLLEQLPHWRG

DVRDRGHGRPWQPSLEPSLPPTLCFPSLSSFSSSWPSAQHLTPS 

VFNPW 


6895 


2379 


478 


VTYVBteDtASPTAt^^MRTVLDtlVEbLQSTSEDKEQQYTSQT ' 
TRLLALL YALASHKACKLAI LHL I NGT I KGDERYAE I FQDLIiAL 
VRSPGDSVIRQQCVEYVTSILQSLCDQDIALILPSSSEGSISEL 
EQLSNSLPNKELMTS ICDCLLATLANSESSYNCLIiTCVRTMMFL 
AEHDYGLFHLKSSLRKNSSALHSLLKRWSTFSKDTGELASSFL 
E FMRQ I LMSDT IGCCGDDNGIiMEVEGAHTSRTMS INAAELKQLL 
QSKEESPENLFLELEKLVLEHSKDDDNLDSLLDSWGLKQMLES 
SGDPLPLSDQDVEPVLSAPESLQNC»FNNRTAYVLADVMDDQIiKS 
MWFTPFQAEEIDTDLDLVKVDLIELSEKCCSDFDLHSELERSFli 
S EPSS PGRTKTTKGFKLGKHKHETF I TSSGKS E Y I EPAKRAHW 
P PPRGRG RGG FGQG IR PHDI FRQRKQNTSRPPSMHVDDFVAAES 
KEWPQDGIPPPKRPLKVSQKISSRGGFSGNRGGRGAFHSQNRF 
FTPPASKGNYSRREGTRGSSWSAQNTPRGNYNESRGGQSNFNRG 

KFVSGGSGRGRHVRSFTR 


6896


1 


555 


GN I VI QKKKYNKQH 1 i PLENVT I DS I RDEGDLRNG 1*1 KTPTK8 
FAVYAATATEKSEWMNHINKCVTDLLSKSGKTPSNEHAAVWVPD 
SBATVCMRCQKAKFTPVNRRHHCRKCGFWCGPCSBKRFLLPSQ 
S S KPVR I CDFCYDLLS AGDMATCQPARSDS YSQSLKS PLNDMS D 
DDDDDDSSD 


6897 


3 


920 


GDGIJflHEVVNGI^RPDWETAIQKPLCSLPAdSGNAIJ^SUNHY " 

AGYEQVTNEDLLTNCTLLLCRRLLSPMNLLSLHTASGLRLFSVL 

SLAWGPlADVDLESEKYRRLGEMRFTl/STFLRLAAIjRTYRGRLA 

YLPVGRVGSKTPASPVVVQQGPVDAHLVPLEEPVPSHWTVVPDE 

DFVLVLALLHSHLGSEMFAAPMGRCAAGVKHLFYVPJVGVSRAML 








LRLFLAMEKGRHMEYECPYLVYVPWAFRLEPKDGKGVFAVDGE 


6898 


919 


346 


QKTVTAVASLLKGRQGIYTENERRMGAVIKIRFFKIMLVLIICW 
LSNI INESLLFYXBMQTDINGGSLKPVRTAAKTTWFIMGILNPA 
QGFLLS LAFYGWTGCSLGFQS PRKB I QWES LTTSAAEGAHPS PL 

MDUPMOR0rVT70rttffTAfnnr\Bi> t e*ut-r nn/i<iiMLnm^n*..». 

Mi^ttcwFitoQyAV&QVG{^TSDEAI*SMLSEGSDAS 
NKNEGDFALPTHGDL 


6899 


120 


827 


M KVR KNNDAYL LDKNKINM DCF IS C FFKKMLTTLM FSHSG I LS L 
LEHGEEYTFSLPCAYARS I LTVPWVELGGKVS VNCAKTGYSAS I 
TFHTKPFYGGKLHRVTAEVBGINITNTWCRVQGEWNSVLEFTYS 
NGETKYVDLTKlAVTKKRVRPLEKQDPFESRRLMKNVTDSliRES 
D1UAA 1 i Liij.fc.KyK 1 ttiiKHKiivrGTPWKTKY FIKEGDGWVY 
HKPLWKI I PTTQPAE 


6900 


3 


451


TEVIXSSKGIHELRSSTSAI^HALBESASLLTMFWRAALPSTHIP 
VLPGKVGESTERBLLELRTKVSQQEQliLQSTTEHLKNANQQKES 
MEQFIVSQLTRTHDVLKKARTNLEVRKLLHQSEAPSLSPTHHHP 
LRDLVGDSWPALRFQEK 


6901


1 


201 


DDNM VQRLE TD FiO^lliOQQS TLEQWAAWtiDNVMMQAL KP YEGRP 
SFPKAARQFLLKWSFYRYHLGFS 


6902


2 


267 


GAPPPPPSQPPRQPPQAAPSSHPHSDLTFNPSSALEGQAGAQGA 
SDMPEPSLDLLPELiTNPDELLSYLDPPDLiPSNSNDDLLSLFENN 


6903 


1 


149 


RINQVYRQGPTG IHI LVI DQWVQNFQDB5CFLF^TVKAES^5bG^ 
HULK 


6904 


464 


2092 


MEASLPVSLSCVLACGDVEGKFDILFNRVQAIQKKSGNFDLLLC
VGNFFGSTQDAEWEEYKTGI KKAPIQTYVLGANNQETVKYFQDA 
DGCELAENITYLGRKGIFTGSSGLQIVYLSGTESLNEPVPGYSF 
SPKDVS SLRMMLCTTSQFKGVDI LLTSPWPKCVGNFGNSSGEVD 
TKKCG S ALVSS LATG LKP R YHFAALE KT YYERL P YRNH 1 1 LQE N 
AQHATR FIALANVGNPEKKKYL YAFS I VPMKLMDAAELVKQP P D 
VTENP YRKSGQEAS I GKQI LAPVEESACQFFFDLNEKQGRKRSS 
TGRDS KSS PHPKQPRKPPQPPGPCWFCIASPEVEKHL VVNIGTH 

V» X 4-irU^AA.urV7lJ01^JJxlVljXljJr XLrHXU^ VvbijSABVVEEVBKYKATL 

RRFFKSRGKWCWFERNYKSHHLQLQVIPVPISCSTTDDIKDAF
ITQAQEQQIELLEIPEHSDIKQIAQPGAAYFYVELDTGEKLFHR 
IKKNFPLQFGREVLASE^ILNVPDKSDWRQCQISKEDEETLARR 
FRKDFE P YD FTLDD 


6905 


1 


226 


VSKTGEAETITSHYLFAI^VYRI^YLFNWIWRVHFEGFFfiElAI - 
VAGLVQTVLYCDFFYLYITKVLKGKKLSLPA 


6906 


3 




SYDDHNGHIDFITAASNLRAKMYSIEPADRFKTKRIAGKIIPAI 
ATTT ATVSGLVALEM IKVTGG YP FEAYKNWFLNLAI P IWFTET 
TEVRKTKIRNGISFTIWDRWTVHGKEDFTLLDFINAVKEKYGIE 
PTMWOGVKMLYVPVMPGHAKKLKLTMHKLVKPTTEKKYVDLTV 
SFAPDIDGDEDLPGPPVRYYFSHDTD 


6907 


2 


2228 


LRGVPVWAAGAFRFSSGEESTSHLIMSRRSQRtiTRYSQGDDDGS ~ 
S S SGGS S VAGSQS TLFKDS PLRTLKRKS SNMKRLS PAPQLGPS S 
DAHTSYYSESLVHESWFPPRSSLEELHGDANWGEDLRVRRRRGT 
GGSESSRASGLVGRXATEDFLGSSSGYSSEDDYVGYSDVDQQSS 
SSRLRSAVSRAGSLLWMVATSPGRLiFRLLYWWAGTTWYRLTTAA 
SLLDVFVLTRRFS S LKTFLW FLL PLLLLTCLTYGAW Y FYPYGLQ 
TFHPALVS WWAAKDSRRADEGWEARDSS PHFQAEQRVMSRVHSL 
ERRLBALAAEFSSNWQKEAMRLERLELRQGAPGQGGGGGLSHBD 
TLALLEGLVSRREAAL KED FRR E TAARIQEE LS ALRAEHQ QDS E 
DL FKKI VRASQES EARI QQLKS EWQS MTQES FQES S VKELRRLE 
DQLAGLQQELAALALKQSSVAEEVGLLPQQIQAVRDDVESQFPA 
WISQFLARGGGGRVGLLQREEMQAQLRBLESKILTHVAEMQGKS 








AREAAASLSLTLQKEGVIGVTEEQVHHIVKQAliQRYSEDRIGLA 
D YALESGGAS VI STRCS ETYETKTALLSLFG 1 PLWYHSQSPRVI 
LQPDVHPGNCWAFQGPQGFAVVRLSARIRPTAVTLBHVPKALSP 
NSTISSAPKDFAIFGFDEDLQQEGTLLGKFTYDQDGEPIQTPHF 
QAPTMATYQWE LRILTNWGHPEYTCI YRFRVHGEPAH 


6908 


3 


780 


QVPSAAWLMAVCGLGSRLGLGSRLGLQGCFGAARIjLYPRFQSRG 
PQGVEDGDRPQPSSKTPRIPKIYTKTGDKGFSSTFTGERRPKDD 
QVFEAVGTTDELSSAIGFALBLVTEKGHTFAEELQKIQCTLQDV 
GSAIATPCSSAREAHLKYTTFKAGPILELBQWIDKYTSQLPPLT 
AF I LPS GGKI S S ALHFCRAVCRRAERRWPLVQMGETDANVAKF 
LNRLSDYLFTLARYAAMKEGNQEKIYKKNDPSAESEGL 


6909 


3 


409 


GRLLAVGTDLYGQRSSAPEQEIiLVQDATPVSNSLLPEKAFSblP 
S P YLRGTI KMMQAVRQAFQDQDDRRTWDGRPLTMAATFDDCLYA 
LCVVDTIKRSSQTGEWQNIAIMTEE PELS PAYLISEAMRRS RMS 
LYC 


6910 


1 


10^8 


LVPVWIDSTOGKiViAPLMlVLYNIFTPHGPDLYGTEPWYFY 
L I NG FLNFNVAFALALLVLPLTS LME YLLQRFHVQNLGHP YWLT 
LAPMYI WFI I FFIQPHKEERFLFPVYPLICLCGAVALSALQHSF 
LYFQKCYHFVFQRYRLEHYTVTSNWLALGTVFLFGLLSFSRSVA 
LFRG YHGPLDL YPE FYR I ATDPT IHTVPEGRP VNVCVGKE W YRF 
PSSFLLPDNWQLQFIPSEFRGQLPKPFAEGPLATRIVPTDMNDQ 
NLEEPSRYIDI SKCHYLVDLDTMRETPREPKYSSNKEBWI SLAY 
RPFLDASRSS KLLRAF YVP FLSDQYTVYVNYT I LKPRKAKQIRK 
KSGG 


6911 


1164 


9££ 


GEDAEEMETGNVANLIS I FGSSFSGLLRKSPGGGREBEEGEESG 
PEAAEPGQ I CCDKPVLRDMNPWSTAIVAF 


6912 


1 


844 


AMKP VETHS FQMLFT I LS TGS ALKAQS YEDAYRCI KSS I LLGS I 
SGGTDIISCFMGHNF5LPVYKGEIQARNLGMAVEAWNEEGKAVW 
GESGELVCTKP I PCQPTHFWNDENGNKYRKAYFS KFPGI WAHGD 
YCRINPKTGGIVMLGRSDGTIiJPNGVRFGSSElYNIVESFEEVE 
DSLCVPQ YNKYREERV I L FLKMASGHAFQ PDL VKR I RDAI RMGL 
S ARHVPSL I LBTKGI P YTLNGKKVEVAVKQI IAGKAVEQGGAFS 
NPETLDLYRDIPELQGF 


6913 


1643 


1558


KKSHEESHKEELSYGAQASLPLPCSDFR 


6914 


1251 


615 


ELAAECKSAGYPGTLIPYRCDLSKfEEDlLSMFSAIRSQHSGVDI 
CINNAGLARPDTLLSGSTSGWKDMFNVNVLALS I CTREAYQS MK 
ERNVDDGHI INI NSMSGHRVLPLSVTHFYSATKYAVTALTEGLR 
QELREAQTHIRATCI S PG WETQFAFKLHDKDPEKAAAT YEQMK 
CLKPEDVAEAVI YVLSTPAH I Q IGD I QMRPTEQVT 


6915 


254 


652 


GRSLSFKTFLIWVLISIYQGGILMYGALVLFESEFVHVVAISFT 
ALILTELLMVALTVRTWHWLMWAEFLSLGCYVSSLAFLNEYFD 
VAF I TTVTFLW XVSAI TWS C LPL YVLKYLRRKLS P P S YCKLAS 


6916


254 


652


GRSLS FKTFLI WVLIS I YQGGILMYGALVLFESEFVHWAI S FT 
ALILTELLMVALTVRTWHWLMWAEFLSLGCYVS S LAFLNEYFD 
VAFI TTVTFL WKVS AITWS CLPLYVLKYLRRKLS P PS YCKLAS 


6917 


254 


652 


GRSLSFKTFLIWVLISIYQGGILMYGALVLFESEFVHWAISFT 
ALILTELLMVALTVRTWHWLM WAEFLSLGCYVSS LAFLNEYFD 
VAFITTVTFLWKVSAITVVS CLPLYVLKYLRRKLS PPS YCKLAS 


6918 


28 


921 


PEAGTRS WREPDP BDLRRFLLSAACRS FPQWLPGGGGGQVS S CS 
DTDVP YLLLAVKS EPGRFAERQAVRETWGSPAPGI RLLFLLGSP 
VGEAGPDLDS LVAWE S RRYS DLLLWD FLDVPFNQTLKDLLLLAW 
LGRHCPTVS FVLRAQDDAFVHTPALLAHLRAL P PAS ARS LYLGE 
VFTQAMPLRKPGGPFYVPES FFEGGYPAYASGGGY VIAGRLAPW 
LLRAAARVAPFPFEDVYTGLCIRALGLVPQAHPGFLTAWPADRT 
ADHCAFRNLLLVRPLGPQAS IRLWKQLQDPRLQC 


6919 


850 


41 


QGRRELSGSVFCPFIQQEPKEMLTLSEYHERVRSQGQQLQQLQA

ELDKLHKEVSTVRAANSERVAKLVPQRLNEDFVRKPDYALSSVG 

ASIDLQKTSHDYADRNTAYFWNRFSFWNYARPPTVILEPHVFPG 

NCWAFEGDQGQWI QLPGRVQLS DITLQHPP PS VEHTGGANSAP 

RDFAVpFLLSFFTHQGLQVYDETEVSLGKFTFDVEKSEIQTFHL 

QNDPPAAFPKVKIQILSNWGHPRFTCLYRVRAHGVRTSEGAEGS 

AQGPH 


6920 


1418 


591 


EAQG PS KVHLTLKKKK 


6921 


2 


1711 


mnatrseeqfhvinhaeqtlrkmenylkEkqlcdvlliaghlri 
pahrlvlsavsdyfaamftndvleakqeevrmegvdpnalnslv 

QYAYTGVLQLKEDTIESLLAAACLLQLTQVIDVCSNFIiIKQLHP 
SNCLGIR5FGDA{^CTELLNVAHKrTMEHFIEVrKNQEFLLLPA 
NBI SKLLCSDDINVPDBETI FHALMQWVGHDVQNRQGELGMLLS 
YIRLPLLPPQLLAOLETSSMFTGDLECQKLLMEAMKYHLLPERR 
SMMQS PRTKP RKSTVGALYAVGGMDAMKGTTT I EKVDLRTNSWL 
HIGTMNGRRLQFGVAVIDNKLYWGGRDGLKTLNTVECFNPVGK 
I WTVMPPMS THRHGLGVATLEGPM YAVGGHDG WS YLNTVERWD P 
EGRQWmrVASMSTPRSTVGVVALNNXLYAlGGRDGSSCIjKSMEY 
FD PHTNKWS LCAPMSKRRGGVGVAT YNG FLY WGGHDAPASNHC 
SR I*S DCVBR YDP KGDS WSTVAPLS VPRDAVAVCPLGDKLYVVGG 
YJX?HTYLNTVESYMQRNEWKEEVPVNIGRAGACWWKIiP 


6922 


1075 


369 


LTPPAGIRHEVRDRBRERERERKREKFPLDSTGfiELKQNIH&iT 
GLP PAMQKVM YKGLAPEDKTLRE I KVTSG AK I MGGGS TINDVLA 
VNTPKDAAQQDAKAEENKKEPLCRQKQHRKVLDKGKPEDVMPSV 
KGAQERLPTVPLSGMYWKSGGKVRLTFKLEQDQLWIGTKERTBK 
LPMGS I KNWS EP I EGHEDYHMMAFQIiGPTEAS YYWVYWVPTQ Y 
VDAI KDTVIiGKWQYF 


6923 


2469 


1660 


LGLFCILPIDTLCAVLERDTLSIRESRLFGAWRWAEAECQRQQ
LPVTFGNKQKVLGKALSLIRFPLMTI EEFAAGPAQSGI LSDREV 
VNLFLHFTVNPKPRVE Y I DR PRCCLRGKECCINRFQQVESRWG Y 
SGTSDRIRFTVNRRISIVGFGLYGS IHGPTDYQVNIQI IEYEKK 
QTLGQNDTGFS CDGTANTFRVMFKEP IEI LPN VC YTACATLKGP 
D SH YGTKGLKKWHET PAAS KTVFFFFSS PGNNNGTS I EDGQ I P 
EIIFYT 


6924 


2210 


1235 


PEERVT CFVE Y YLTAFHEGRKGAliAKKP YNP 1 1 GETFHCS WEVP " 

KDRVKPKRTASRSPASCHEHPMADDPSKSYKLRFVAEQVSHHPP 

ISCFYCECEEKRLCVNTHVWTKSKFMGMSVGVSMIGEGVLRLLE 

HGEEYVFTLPS AYARS ILTI PWVELGGKVSINCAKTG YSATVI F 

HTKPFYGGKVHRVTAEVKHNPTNTIVCKAHGEWNGTriEFTYNNG 

ETKVlDTTTLPVYPKKIRPLEXQGPMESRNLWREVTRYLRIiGDI 

DAATEQKRHLEEKQRVEERKRENLRTPWKPKYFIQEGDGSGILQ 

SPLESTLMGLEVQSFPV 


6925 


2 


1653 


rggaagaambpdsviedktIelmcsvprslwlgcanlvesmcAl 

SCLQSMPSVRCLQISNGTSSVIVSRKRPSEGNYQKEKDLCIKYF 
uy w&ius DQ VEFVEHIjI SRMCHYQHGH INS YLKPMLQRDFITALP 
EQGLDH I AENIL S YLDARS LCAAELVC KEWQRV I S EGMLWKKL I 
ERMVRTDPLWKGLSERRGWDQYLFKNRPTDGPPNSFYRSLYPKI 
IQOIETIESNWRCGRHNLQR IQCRSENSKGVYCLQYDDEKI ISG 
LRDNS I KIWDKTSLECLKVLTGHTGS VLCLQYDERVI VTGSSDS 
TVRVWDVNTGEVLNTLIHHNEAVLHLR FSNGLMVTCS KDRS IAV 
WDMAS ATDITLRRVLVGHRAAVNVVDFDDKYI VSASGDRTI KVW 
S TSTCE FVRTLNGHKRGIACLQYRDRLWSGSSDNTIRLWD IEC 
GACLRVLEGHE ELVRC IR FDNKRI VSG AYDGKI KVWDLQAALD P 
RAPAS TliCLRTLVEHSGRVFRIiQFDEFQI IS SSHDDTI L I WDFL 
KVPPSAQNETRS PSRTYTYISR 


6926


1 


733 


SGRVAMDGLGLQ FPEQGFPAGPPLLP PHMGGH YRDCQSLGAP PL 








dgypLptpdtspldgvdpdpaffaapmpgdcpaagtysyaqvsd 

YAGP PEP PAG PMHPRLG PEPAG PS I PGLLAP PS ALHVYYGAMGS 
PGAGGGRGFQMQPQHQHQHQHQHHPPGPGQPTPPPEALPCRDGT 
DPSQPAELLGEVDRTEFEQYLHFVCKPEMGLPYQGHDSGVNLPD 
SHGAISSWSDASSAVYYCNYPDV 


6927


2 


1484 


LTLCGDIQLMLAQNANNRAAHLEEFHYQTXEDQEILHSLHRESS 
CQGFAWATDI*STDLESQZjSVSCXCYEAANBILQFRDLKSQNPBH 
YVQVLKRMGN I RNE IG VFYMNQAAALQSERLVS KS VSAAEQQLW 
KKS FSCFEKGIHNFES I BDATNAALLLCNTGRLMRlCAQAHCGA 
GDELKRBFSPEEGLYYNKAIDYYLKALRSLGTRDIHPAVWDSVN 
WELSTTYFTMATLQQDYAPLSRKAQEQIEKEVSEAMMKSLKYCD 
VDSVSARQPLCQYRAATIHHRLASMYHSCLRNQVGDEHLRKQHR 
VLADLHYSKAAKLFQLLKDAPCELLRVQLERVAFAEFQMTSQNS 
NVGKIiKTLSGALDIMVRTEHAFQLIQKELlEEFGQPKSGDAAAA 
ADASPSLNREEVMKLLS IFESRLSFLLLQS I KLLSSTKKKTSNN 
IEDDTILKTNKHIYSQLLRATANKTATLLERINVIVHLLGQLAA 
GSAASSNAVQ 


6928 


1086 


777 


EAIDLINNljliQVKMRKRYSVDKTLSHPWLQDYQTWLDLRELECK " 
IGERYITHES DDLRWEKYAGEQGLQYPTHLINPSASHSDTP ETE 
ETEMKALGBRVSIL 


6929


1749 


607 


RDQRGYRDDRSPAREPGDVSARTRSGGGGGRSATTAMPPPVPNG
NLHQHDPQDLRHNGNVVVAGRPSCS RGPRRAI Q KPQ P AGGRRS G 
RGPAAGGLCLQPPDGGTCVPEEPPVP PMDWEALEKHLAGLQFRE 
QEVRNQGQARTNSTSAQKNERES IRQKLALGS FFDDGPG I YTS C 
SKSGKPSLSSRLQSGMNI/3ICFVNDSGSDKDSDADDSKTETSIiD 
TPLS PMSKQSSS YSDRDTTEEESESLDDMDFLTRQKKLQAEAKM 
ALAMAKPMAKMQ VEVEKQNRKKS P VADLLPHMPH I S ECLMKRS L 
KPT0LRDMTIGQLQVI VKDIiHS QIESLNEBLVQLIiLIRDBLHTE 
QDAMLVD I EDLTRHAESQQKHMAEKMPAK 


6930 


131 


545 


fkdtanvfvslfqmri^frhyfiepsolkL^dvitwiVtqvai 

S YTWPFVLLS I KPSLTFYS S WYYCLHILGILVLLLLP VKKTQR 

RKNTHENIQLSQSKKFDEGENSIX3QNSFSTTNNVCNQNQEIASR 
HSSLKQ 


6931 


2 


659 


FVERLPNRPACLLVASGAAEGVSAQSFIiHCFTMASTAFNLQVAT 
PGGKAMEFVDVTE SNARWVQDFRLKAYASPAKIJE5 IDGAR YHAL 
LI PSCPGALTDLASSGSLAR I LQHFHS ES KP I CAVG HGVAALCC 
ATNEDRS WVFDS YSLTGP SVCELVRAPGFARLPLVVEDFVKDSG 
ACFSASE PDA VHWLDRHLVTGQMASSTVPAVQWLL FLCGSRK 


6932 


2 


1131 


FVDSPGQGEQAEEEEGGIQMNSRMRAHSPAEGASVESSSPGPKK

SDMCEGCRSLAAGHPGYISHDKETSIKYVSHQHPSHPQLFSIVR 

QACVRSLS CEVCPGREGPIFFGDEQHGFVFSHTFF I KDSLARGF 

QR WYSI I TIMMDRI YLINS WPFLfcGKVRGI IDELQGKALKVFEA 

EQFGCPQRAQRMNTAFTPFLHQRNCaJAARSLTSLTSDDNLWACL 

HTSFAWLLKACGSRLTEKLLEGAPTEDTLVQMEKLADLEEESES 

WDNSEAEEEEKAPVLPESTEGRRI.TriGPAP«3QQT.Qr3pr»QunDTJir 

LPVFKSLRHMRQVGGRGTAHHELRRRANHGLCLPTRLASGPSTL 
KTLQEVTDS LLGGWLMAQGVGGI I 


6933 


1431 


890 


SLNLHCTLPPPPHQYPAGYPSDKEGKKPKGQSKKQPSGTTKRPI 
SDDDCPSASKVYKASDSAEAIEAFQLTPQQQHIjIREDCQNQKLW 
DE VLSHIiVEGPNFLKKLEQS FMC VCCQELVYQ PUTTECFHNVCK 
DCLQRS FKAQVFSCPACRHDLGQH YIM I PNE I LQTTLLDLFFPGY 
SKGR 


6934 


3030 


2*88 


DRDHSQCGGIRRVALARVSSVKLISKAKIRTVKMTFI I VtiAFIV " 
CWTP FFFVQMWS VWDANAPKBASAFI IVMLIASLNSCCNPWIYM 
LPTGHLFHELVQRFLCCSASYLKGRRLGETSASKKSNSSSFVLS 
HRSSSQRSCSQPSTA 


6935


886 


543 


NSALYVAGGNDGTSCLNSVBRYSPKAGAWESVAPMNIRRSTHDIj 
VAM DG WLYAVGGND GS S S LNS I EKYNPRTNKWVAAS CMFTRRS S 

VftUttVT.PT .I.MffDDDO CDTT.C VC QTOT. 

v v v /\ v uciuuiMr ** Ff o o \r 1 Lt& V £>o 1 oJj 


6936 


1347 


567 


RSHRRQFLSRALLfePfGKSHPPPHRLPRKSLNVGLHYSHIPFLT 
TCIjHFLRKRLQKGEVGLSVETSKPQVPVGGLSRKKVPQEPWATV 
ME KRLQEAQLYKEEGNQRYREGK YRDAVS RYHRALLQLRGLDPS 
LPSPL PNLG PQGPALTPEQEM ILHTTQTDCYNNLAACbLQME P V 
NYERVREYSQKVLERQPDNAKALYRAGVAFFHLQDYDQARHYIiL 
vjn KWr AJJANVKK i IiQuIuo BIjS 5 YnRKEiCQLYIiGMFG 


6937 


1 


727 


AVEFRCCPGRDPACFARGWRLDRVYGTCFCDQACRFTGDCdPbY 
DRACPARPCFVGEWSPWSGCADQCKPTTRVRRRSVQQEPQNGGA 
PCPPLEERAGCLEYSTPQGQDCGHTYVPAFITTSAFNKERTRQA 
TSPHWSTHT3DAGYCMBFKTESLTPHCALENRPLTRWMQYLREG 
YWCVDCQPPAMNSVSLRCSGDGLDSDGNQrLHWQAIGNPRCQG 
TWKKVRRVDQCSCPAVHSFIFI 


6938


«» 
j 


719 


NSRKLEIiAERVDTDFMQ LKKRRQSS EKBNDSGTLDT VGAVWDH 
EGNVAAAVS SGGLALKH PGR VGQAALYGCGCWAENTGAHNP YS T 
AVS TSGCGEHLVRT I LARE CSHALQ AEDAHQ ALLE TMQNKF I S S 
P FLAS EDG VLGGVI VLRS CRCSAEP DS S QNKQTLLVE FLWSHTT 
ESMCVGYMSAQDGKAKTHI SRLPPGAVAGQS VAIEGGVCRLGEP 
SELTLQAECEASQRHFRT


6939




OlQ 


JtVTAfKKfQKrsSGHGSDNSSVLSGELPPAMGRTALFHHSGGSS 
GYESLRRDSEATGSASSAPDSMSESGAASPGARTRSLKSPKKRA 
TGLQRRRU PAPLPDTTALGRKPSLPGQWVDLPPPLAGSLKEPF 
E I KVYE I D0VERLQR PRPT PREAPTQGLACVS TRLRLAERRQQR 
LREVQAK^KHLCEELAETG^RLMLEPGRWLEQFEVDPELEPESA 
EYLAALERATAAI^EQCVNLCKAHVMMVTCFDISVAASAAI PGPQ 
EVDV 


6940 


1188 


496 


GKMAAQPLRHRSRCATPPRGDFCGGTERAIDQASFTTSMEWDTQ 
V v j^vjoo r LA* rJ\\jXJjit\£iti4 Jt'/ViJj f v Ju o Wljy P EKCAVKQCAQ CHA V 
LADSVHLAWDLSRSLGAWFSRVTNNWLKAPFLVGIEGSLKGS 
TYNLLFCGSCGI PVGFHLYSTHAALAALRGHFCLSSDKMVCYLL 
KTKAIVNASEMDIQNVPLSEKIAELXEKIVLTHNRLKSLMKILS 
EVTPDQSKPEN 


6941 


1 


713 


SLSRADSDPHGPHTCGHVLNVI IGSNVLALAEAQRQAEALG YQA 

V V Jj j/VU IUuw V AOl'Irv^ Xr IVIULvUIVAKI Ki-I J Ir &NAl3AoVj£iSUAuI' 

HELAAELQI PDLQLEEAIjETMAWGRGPVCLIAGGEPTVQLQGSG 
RGGRNQELALRVGAELRRW PLGPI DVLFLSGGTDGQDG PTEAAG 
AWVTPBLASQAAAEGLDIATFLAHNDSHTFFCCLQGGAHLLHTG 
MTGTNVMDTHLLFLRPR 

6942




1 


246 


GDYVERYDPKTDTWTMGAPLSMPTNAVGGCIiIjGDRLYADGGYDG 
OTYLNTMESYDPO/IWEWTQMASLNIGRAGACVVVIKQP 


6943 


1 


739 


PMATGDGAKTLAIHVKALTADS I RITWKATLPASS FRLSWLRLG 
HSPAGGSITETLVQGBKTEYLLTALEPKPTYI ICMVTMETTNAY 
VADETPVCAKAETADS YGPTTTLNQEQNAGPMASLPLAGI IGGA 
VALVFLFLVLGAI CWYVHQAGELLTRERAYNRGSRKKDDYMESG 
TKKDNS ILE IRG PGLQMLP INPYRAKEEYWHTI FPSKGSSLCK 
ATHTIGYGTTRGYRDGGIPDIDYSYT 


6944 


960


154 


VAN I LLNGVK YE S BLTGS SERAEQPLS VGRIiCST I CNM PKAliRT 
LCVNH FLGWLS FEGMLLF YTDFMGE WPQGDPKAPHTSEA YQKY 
NSGVTMGCWGMCI YAFSAAFYSAIEiEKLEEFLS VRTLYFIAYLA 
roLGTGLATLSRNLYVVLSLCITYGILFSTLCTLPYSLLCDYYQ 
SKKFAGSSADGTRRGMGVD I SLLS CQ YFLAQ I LVSLVLGPLTSA 
VGS ANG VMYFS SLVSFLGCLYSSLFV I YE I PPSDAADEEHRPLL 
LNV 


6945


2067 


179 


EGEDRGLPRTMGAALGTGTRIAPWPGRACGALPRWTPTAPAQGC
HSKPGPARPVPLKKRGYDVTRNPHLNKGMAFTLBERtiQLGIHGL 
IPPCFLSQDVQliLRlMRYYERQQSDLDKYI I LMTLQDRNEKLFY 
RVLTSDVEKPMPI VYT PTVGLACQHYGLTFRRPRGLFIT IHDKG 
HLATMLNS Wp EDNI KAVWTDGERI LGLGDLGC YGMGI P VG KLA 
LYTACGGVNPQQCLPVLLDVGTNNEELLRDPLyiGLKHQRVHGK 
A YDDLLDE FMQAVTDKFG I NCL IQFEDFANANAFRLLNKYRNKY 
C^IFNDDIQGTASVAVAGILAALRITKNKLSNHVFGFQGAGEAAM 
G\IAHLLVMALE\KEGVPKA\EATRKIW\MVDF\KGLiIVQGRDH 
LNHEKEMFAQD\HPEVNSIiEEWRLVKPTAl IGVAAIAEA\FTE 
QILRDMASFHERP\IIFALSNPTSKAECTA\EKCYRVTBGPRGF 
FAS \ GS PF*G VL I WEMGKTFI PGGRGNNA*RVPRG WQLGVHSPG 
GDPGHIP\DEIFLPDSRAKLPQEVSEQHLSQGRLYP\PLST\IR 
NVFLRIAIKVFD*GYKHNLV\SYYPEPKD\KEAFCKIPGSYTPD 
YDS FYT/VDS Y I WAQGKAMNVQTV 


6946 


133 


2551 


SCEYSGITVAPGDPCPGVAHLLAPSMASDTPESLMALCTDFCIiR - 

NLDGTIX3YLLDKETLRLHPD I FLPS E I \CDRLVNE Y VELVNAAC 

NF\EPHE\SFFNPLFRDPRKQPASRRIHL\RBD\LVQD\QD\LE 

AIRKQDL\VBL\YLlW\CEKLSAKSLQTT^FSHTLGVP*AFro 

C\TNILLLRKENPGGL/CEDEYLFNPTCQVLVKDFTFEGFSRLR 

F\LKLGRMIDWVPVES\LLRPLNSLAALDLSGIQTSDAA\FLTQ 

WKDSL\VSLVL\YNMDLSDDHIR\VIVQLHKLRmDISRDRLSS 

YYKFKLTREVLSLFVQKLGNLMSLDI SG\HMILENCSIS KIGKR 

BAGQTSI\EPSK\SSIIPFRGFEGGPLQF\LGVF*GIFCGRIiTH 

I PAY KVS GDKNE EQ VLNAI EAYTEHR P EITSRAINLLFD IAR I E 

RCNQLIjRALKLVITAIiKCHKYDRNI QVTGSAALF YLTNSE yrs e 

QS VKLRRQVIQ VVLNGMES YQE VTVQRNCCLTLC2JFS IPEBLEF 

QYRRVNELLLS I LN PTRQDES I QR I AVHLCNALVCQVDNDHKEA 

VGKMGFWTMLKLTQKKLLDKTCDQVMEFSW\SALWNITDETPD 

NCEMFLNFNGMKLFLDCI^EFPEKQELHRNMLGLLGNVAEVKEI* 

RP^LMTSQFISVFSNLLESKADGIBVSYNAOGVLSHIMFDGPEA 

WGVCEPQREEVEERMWAAIQSWD1NSRRNINYRSFEPILRLLPQ 

GISP VSQHWATWALYNLVS VYPDKYCPLLIKEGGMPLLRDI I KM 

ATARQETKEMARKVIEHCSNFKEENMDTSR 


6947 


2 


1682 


TSVSTI PRGliASARPQSRSWRCCPVWRRSPGRARGRGLKMl/NVP " 
SQSFPAPRSQQRVASGGRSKVPLKQGRSLMDWIRLTKSGKDLTG 
LKGRLIEVTEEELKKHNKKDDCWICIRGFVYNVSPYMEYHPGGE 
DELMRAAGSDGTELFDQVHRWVNYESML KECLVGRMAI KPAVLK 

dyrbeekkvlngmlpksqvtdtiiakegpsypsydwfqtdslvti 
/ehiy*tegyqfrlnns*ssb*flysrnny*gli*isytyw/r*a 
mrfrki flogl/cesvgkiei vlqkkentswdflghplknhnsl 

IPRKDTGLYYRKCQLISKEDVTHDTRLFCLMLPPSTHLQVPIGQ 
HVYLKLPITGTEIVKPYTPVSGSLLSEFKEPVLPNNKYIYFLIK 
I YPTGLFTPELDRLQ IGDFVS VS S PEGNFKI S KFQELEDLFLLA 
AGTGFTPMVKILNYALTDlPSLRKVKI^FFNKTEDDTTWPQrtT v 
KXxAFKDKRLDVBFVIiSAP I SEWNGKQGHISPALLSBFtiKRNLDK 
SKVLVCICGPVPFTEQGVRLliHDLNFSKNEIHSFTA 


6948 


104 


58 


PDGAHSFFPDEYFTCSSLCLSCGVGCJkKSMNHGKEGVPHBAKSR" 

CRYSHQYDNRVYTCKACYERGEEVSWPKTSASTDSPWMGLAKY 

AWSGYVIECPNCGWYRSRQYWFGNQDPVDTWRTEIVHVWPGT 

DGFIiKDNNNAAQRLLDGMNFMAQSVSELSLGPTKAVTSWLTDQI 

APAYWRPNSQILSCNKCATSFKDNDTKHHCRACGEGFCDSCSSK 

TR PVPERG WGPAPVRVCDNCYEAR/TRPVS CYRGTSGR * RRRRT 

QETVB 


6949 


152 


46S6 1 


GLRLCLSRPLTRPGDDS VGGS AMASGAGGVGGGGGGKI R^TRRCH " 
QGPIKPYQQGRQQHQGILSRVTBSVKNIVPGWLQRYFNKNEDVC 








SCSTDTSEVPRWPBNKBDHLVyADEESSNlTDGRITPEPAVSNT 

BEPSTTSTAST\yPDVLTRVSLYRSHLNFSMLESPALHCQPSTS 

SAPPIGSSGFSLVKEIKDSTSQHDDDNISTTSGPSSRASDKDIT 

VSKNTSLPPLWSPEAERSHSLSQHTATSSKKPAFNLSAFGTLSP 

SLGNSSILKTSQLGDSPFYPGKTTYGGAAAAVRQSKLRNTPYQA 

PVRRQMKAKQLSAQS YG VTSSTARRI LQSLEKMSS PLADAKR I P 

S I VS S PLNSPLDRSGI D I TD FQAKRE KVDSQYPPVQRLMTP KPV 

S I ATNRS VYFKPS LTPSGEFRKTNQR I DKKCS TGYEKNMTPGQN 

REQRESGFSYPNFSLPAANGLSSGVGGGGGKMRRERHAFVASKP 

LEEEEMEGPVLPKISLPITSSSLPTFNFSSPEITTSSPSPINSS 

QALTNKVQMTSPSSTGSPMFKPSSPIVKSTEANVLPPSSIGFTF 

SVPVAKTABLSGSSSTLEPI ISSSAHHVTTVNSTNCKKTP PEDC 

EGPFRPAEILKEGSVLDILKSPGPASPKIDaVAAQPTATSPWY 

TRPAISSFSSSGIGFGESLKAGSSWQCDTCLLQNKVTDNKCIAC 

QAAKLSPRDTAKQTGIETPNXSGKTTLSASGTGPGDKFKPVIGT 

WDCDTCLVQNKPEAI KCVACETPKPGTCVKRALTLTvVsESAET 

MTAS S S S CTVTTGTLGFGDKFKRP I GS WECS VCC VSNNAEDNKC 

VSCMSEKPGSSVPTSSSSTVPVSLPSGGSLGLEKFKKPEGIWDC 

ELCLVQNKADSTKCLACESAKPGTKSGFKGFDTSSSSSNSAASS 

SFKFGVSSSSSGPSQTLTSTGNFKFGDQGGFKIGVSSDSGYINP 

MSEGF*FSKHIVGFKFGVSSESKPEEVKKDSKNDNFKFGLSFGL 

SNPVFLTPFQFGVSNLGQEB1CKEELLKSSCAGFRPGTCVINSTR 

VPANT I VTSENKSS FNLGT I ETKS VS VAPLKCQTS EAKKEEMPA 

TKGG FS FGNVE PASLPS AS VFVLGRTEEKQQE P VTSTSLV FGEG 

KI*TMKEPKC\QPVFSFGEFQRQTKDENSSKSTFSFSMTKPSEKE 

SEQPAKATFAFGAQTNTTADQGAAKPDLSYLNNSSSSSSTPATS 

AGGG \ I FGSSTSS S NPP VATFVFGQS SNPGS SS \AFGNTAES S T 

SQSLLFSQDSKLATTSSTGTAVTPFVFGPGASSNNTTTSGFGFG 

ATTTS S S AGSSFVFGTGPSA PSAS PAFGANQTPTFGQS QG AS Q P 

NPPGFGSISSSTALFPTGSQPAPPTFGTVSSSSQPPVFGQQPSQ 

SAFGSGTTPNSSSAFQFGSSTTNFNFTNNSPSGVFTFGANSSTP 

AASAQPSGSGGFPFNQSPAAFTVGSNGKNVFSSSGTSFSGRKIK 
TAVRRRK 


6950 
6951 


2585 


411 


PRPGSRSGLCkRAUER^RAGGLSRRTRAK^IMDEtiHYQDTDS " 

DVPEQRDSKCKVKWTHEEDEQLRALVRQFGQQDWKF1ASHFPNR 

TDQQCQYRWLRVLNPDIfVKGPWTKEEDQKVlELVKKYGTKQWTL 

IAKHLKGRLGKQC31ERWHNHLNPEVKKSCWTEEEDRI ICEAHKV 

LGNRWAEIAKMLPGRTDNAVKNHWNSTIKRKVDTGGFLSESKDC 

A.r*v x ijlfij^LiEDKDGIiQSAQPTEGQGSIiLTNWPSVPPTIKEEEN 

SEEELAAATTSKEQEPIGTDLDAVRTPEPLEEFPKREDQEGSPP 

ETSLPYK^TVVEAANLLIPAVGSSLSEALDLIESDPDAWCDLSKF 

DLPEEPSAEDSINNSLVQLQASHQQQVLPPRQPSA\LVPSVTBY 

RLDGHTISDLSRSSRGELIPISPSTEVGGSG1GTPPSVLKRQRK 

RRVALSPVTENSTSLSFLDSGNSLTPKSTPVKTLPFSPSQFLNF 

WNKQDTLEI^SPSLTSTPVCSQKVVVTTPLHRDKrPIiHQKHAAF 

VTPDQKYSMDNTPHTPTPFKNALEKYGPLKPLPQTPHLSEDLKE 

VLRSEAGIELIIEDDIRPEKQKRKPGLRRSPIKKVRKSLALDIV 

DEl^KLMMSTLPKSLSLPTTAPSNSSSIiTItSGIKEDNSLLNQGF 

LQAKPEKAAVAQKPRSHFTTPAPMSSAWKTVACGGTRDQLFMQE 

KARQLLGRLKPSHTSRTLILS 




1940 


229 


AGPDDTMKRSLQAL Y CQLLS FIjL iLALTEAIiAFAIQE PSPRESL 
QVLPSGTPPGTMVTAPHSSTRHTSWMLTPNPDGPPSQAAAPMA 
TPTPRAEGHPPT\TPSPPSLRQ*PPPII,ICAP/SSTGPAPAAMAT 
TSSKPEGRPRGQAAPTILLTKPPGATSRPTTAPPRTTTRRPPRP 
PGSSRKGAGNSSRPVPPAPGGHSRSKEGQRGRNPSSTPLGQFCRP 
LGKIFQIYKGNFTGSVEPEPSTLTPRTPLWGYSSSPQPQTVAAT 








TVPSNTS WAP TTTSIjGPAKDKPGLRRAAQGGGSTFTS QGGTPDA 
TAASGAPVSP/PSCPSAFSAPPPR*PTGWPQP**LLAYCYP\CT 
S RPI*STS SGVFTAATGPTPAAFDTS VSAPSQGI PQGAS TTPQAP 
THPSRVSESTISGAKBETVA\PSP*PTGCPVLSPQWYPQPQAIS 
STAWSPPGPGSLGQQGTSPMWPRGTNRSTEPPSA*ARWISPG*S 
WPSACPSPP\LCPADGVLHEBEEEDRQPGEQPEAYGNNTHHPGT 
TFQQAC \RGAAPGE I P VPLKPLRTQLSEPRSPANGD YRDTGMVP 
C 


6952 
6953 


658 


1 304 


PESEGESGEMTDRYTIHSQLBHljQSKYIGT\ATPtf]PPSG&G\CE 
PTPRLVLLLHGPLRPSQLLRHCGE* EQSASPLQLDGKDASALWT 
ASRQARGBLRLCLTTAVRGTSPS VS PVCQSS 


6954 


1512 


349 


NWG KTRALASGKHVPFGKQTNPNKS / VHCDS * G* * RRETTQDES 
FSPHFRGKMGGW\KLEKBLENTEQPVGGNEG*EHEVTGNLNSD 
PLLELCQCPLCQLDCGSREQLIAHVYQHTAAWSAKSYM\CPVC 
GRALSSPGSLGRHLLIHSEDQRSNCAVCGARFTSHATFNSEKLP 
BVLNMESLP T VHNEG PS SAEGKD I AFS P P VYPAGILLVCNNCAA 
YRKLLEAQTPSVRKWALRRQNEPLEVRLQRLERERTAKKSRRDN 
ETPEERBVRRMRDREAKRLQRMQETDEQRARRLQRDREAMRLKR 
AIETPEKRQARLIREREAKRLKRRLEKMDMMLRAQPGQDPSAMA 
ALAAEMNFFQLPVSGVELDSQLLGKMAFEEQNSSSLH 


6955 


819 


1 


PPPPFIIPSHPREAGT*AG*KRSGDSECSPPVEQ*A*TRAAAQN 

* PQR* R WTEGNS PQASAVATPGQGASPAAPRCTP* PSRRHRRLP 
PGARPPAG* AAPAPTKPWLAGPASA PQPGAAPLS P PAPPLI RTR 

* CAGAAARGR PRRDRS PR PRTPGGCS WSEPRTPPAVSASAQTPS 
DAG * AGGR*GQRQRPS TGR* PPGVGGAGRSHRREGTI PGNPHPR 
Ai, RAGWQR* PGP/REWGL+EPQGEEMSGPGGPGGAPPNQVGSS 
VMQAMSTGI 




i96"8 


782 


PFC5KRQVRAQVAGAPVGHWGTRARQVKTGGRRRARRTMPFLGQD 
WRS PGWS WI KTEDGWKRCES CS Q KLERENNHCN I SHS 1 1 LNS ED 
GE I FNNEEHE YAS KKRKKDHFRNDTNTQS FYREKWI YVHKE STK 
ERHGYCTIjGEAFNRLDFSSAIQD IRR FNYWKLLQLIAKSQLTS 
LSGVAQKNYFNILDKIVQKVLDDHHNPRLIKDULQDLSSTLCir. 
/N* RSRE VCI SG KHQ YLDL P I RNYSRLATTATGSS DD * AS E \NG 
LTLSDLPLHMLNNILYRFSDGWDIITLGQVTPTLYMZiSEDRQLW 
KKLCQYHFAEKQFCRHLILSEKGHIEWKLMYFALQKHYPAKEQY 

GDTLHFCRHCSILFWKDSGHPCTAADPDSCFTPVSPQHFIDLFK 
F 


6956 


8605 


3839 


QTSTS I FASPTS PPVLGEi»VLQDNSFDLNNGSDAEQEEMETQSS 

DFPPSLTQPAPDQSSTIQLHPATSPAVSPTTSPAVSLWSPAAS 

PEISPEVCPAASTWSPAVFSWSPASSAVLPAVSLEVPLTASV 

TS PKAS PVTS PAAAFPTAS PANKD VS S FLETTADVEE ITGEGLT 

ASGSGDVMRRRIATPEEVRLPLQHGWRREVRIKKGSHRWQGETW 

YYGPCGKRMKQFPEVIKYLSRNWHSVRREHFSFSPRMPVGDFF 

EERDTPEGLQW VQLSAEEI PSRI QAI TGKRGRPRNTE KARTKE V 

P KVKRGRGRP PKVKI TELLNKTDNR PLKKLEAQETLNEEDXAK I 

AKSKKKMRQKVQRGECQTTIQGQARNKRKQETKSLKQKEAKKKS 

KAEKEKGKTKQEKLKEKVKREKKEKVKMKEKEEVTKAKPACKAD 

KTIATQRRLEERQRQQMILEBMKKPTEDMCLTDHQPLPDFSRVP 

GLTLPSGAFSDCLTIVEFLHSFGKVLGFDPAKDVPSLGVLQEGL 

L CQGDSLGE VQDLLVRLLKAALHD PGFPS YCQSLKILGEKVS E I 

PLTRDNVSEILRCFLMAYGVEPALCDRIiRTOPFQAQPPQQKAAV 

LAFLVHELNGSTLI INEIDKTLESMSS YRKNKWI VEGRLRRLKT 

VLAKRTGRS EVEMEG PE E C LGRRRS SR 1MEVTSGMEEEEEEES I 

ftAVPGRRGRRDGEVDATAS S I PBLERQ IEKLS KRQL FFRKKLLH 

9SQMLRAVSLGQDRYRRRYMVLPYIAGlFVEGTEGNr,VPEEVIK 

KETDSLKVAAHASLNPALFSMKMELAGSNTTASS PARARGRPRK 








TKPGSMQPRHLKSPVRGQDSEQPQAQLQPEAQLHAPAQPQPQLQ 
LQLQSHKGFLEQEGSPLSLGQSQHD1USQSAFLSWLSQTQSHSSL 
LSSSVLTPDSSPGKLDPAPSQPPEEPBPDEAESSPDPQALWFNI 
S AQMPCNAAPTPP PAVSE DQPTPSPQQLASS KPMNRP S AANPCS 
PVQFSSTPIaAGLAPKRRAGDPGEMPQS PTGLGQPKRRGRPPS KF 
F KQMEQ R YL TQLTAQ P VP PEMCSGWWWI RD PE MLD AMLKALH PR 
G I REKALHKHLNKHRDPLQEVCLRPSADP IFEPRQLPAFQEGIM 
SWSPKEKTYETDLAVLQWVEBLEQRVIMSDW2IRGWTCPSPDST 
REDLAYCTHLSDSQEDITWRGRGREGLAPQRKTTNPLDLAVMRL 
AALEQNVERRYLRE PLWPTHEVVLEKALLSTPNGAPEGTTTE I S 
YEITPR IRVWRQTLERCRSAAQVCLCLGQLBRS IAWEKSVNKVT 
CLVCRKGDNDEPT,LLCDGCDRGCHIYCHRPKMEAVPEGDWFCTV 
GLAQQVEGEPTQKPGFPKRGQKRKSGYSLNFSEGDGRRRRVLLR 
GRE S PAAGPR YS EEGLS PS KRRRLSMRNHHSDLTFCE 1 1 LMEME 
SHDAAWPFIiEPVNPRLVSG YRR 1 1 KNPMDFS TMRERLLRGGYTS 
SEEFAADALLVFDNCQTFNEDDSEVGKAGHIMRRFFE\SRWEEF 
YQGKQGQSVRQGRWGVTLWHLPPTFQTKTCHFHLLMLPWVQTQV 
RYNPDF 


6957 


82 


3 514 


HL I VAMPEPTKKE ENS VPAPAPP PEEPSKEKEAGTTPAKD WTLV 
ETPPGEEQAKQNANSQLSILFIEKPQGGTVKVGEDITFIAKVKA 
EDLS EKPTINGS R KWMDLASKAGKHLQLKETFERHSRVYT FEMQ 
1 1 KAKDNFAGNYRCEVTYKDKFDSCSFDLEVHESTGTTPNIDIR 
SAFKRSGEGCEDAGELDFSGLLKRREVKQQEEEPOVDVWELLKN 
TKP S E YEKIAFQ YESPTCSGML KRLKRS I REEKKSAAFAKI LD P 
VYQVDKGGRVRFWELADPKIiEVKWNKNGQELRPSTKYI FEDTR 
CQS I I^IDNCQMTDDSE YYVTAGDEKCSTEIJjVREPPIMVTKQL 
EDTTD YCGER VEI»ECE VS EDDAQVKWFKNGBE I ILVQTR YRIRV 
EGKKHILIIEGATKADAADYSVMTTGGQSSAKLSVDLKPLKILT 
PLTDQTVNLGKEI CLKCE I SENIPGKWTKNGLPVQESDRLKWH 
KGRIHKLVIDHALTEDEGDYVFAPDAYNVTLPAKVHVIDPPKI I 
LDGLDADNTVTVIAGNKLRLEIPISGEPPPKAMWSRGDKAIMEG 
SGRIRTESYPDSSTLVIDIAERDDSGVYHINLKNEAGEAHASIK 
VKWDFPDPPVAPTVTEVGDDWCIMNWEPPAYDGGSPILGYFIE 
R KK KQ S S RWMRLNFDL CKE TT FE PKKM I EGVAYE VR I FAVN A\ I 
GISKPSMPSRPFVPIAVTSPPTLLTVDSVTDTTVTMRWRPPDHI 
GAAGIJX5YVLEYCFEGS TSAKQSDENGEAAYDIiPAEDWl VANKD 
LIDKTKFTITGLPTDAKIFVRVKAVNAAGASEPKYYSQPIIjVKE 

iieppkihspkhlkqty1rrvgdrvilvipfqgkprpeltwkkd 
gaeidknqinirwsetdtiifirkaershsgkydlqvkvdkfve 
tas id i r i idrpg p pqivki edvwgrnvaltwtpp kddgnaai t 
gytiqkadkksmewlrviehiiepvphtelvigneyyfrvfsen 
mcglsedatmtkesaviardgkiyknpvyedfdfseapmftqpl 
vnrlchsgymatlncsvrgnpkpkitwmknkvaivddpryrmfs 
nqgvctlei rkpspydggtycckavndlgtvei ecklevkviao 


6958


274 


1663 


PRTSRVKTEGSOGSSAMDpqvin/nTElfP\rrrPTrT t tbdt cf — 
DCGHSFCQACITAKIKESVI ISRGESS CPVOQTRFQPGNLRPNR 
HLANIVERVKEVKMSPQEGQKRDVCEHHGKKLQIFCKEDGKVIC 
WVCELSQEHQGHQT FRINE WKECXJEKIiQVALQRL I KENQEAE K 
LEDDIRQERTAWKNYIQIERQKILKGFNEMRVILDNEEQRELQK 
LEEGE VNVLDNLAAATDQ L VQQRQDAS TLI S DliQRRLRGS S VEM 
LQDVI D VMKRSESWTLKK PKS VS KKLKS VFR VPDLSGMLQVLKE 
LTDVQYYWVDVMLNPGSATSNVAISVDQRQVTCTVRTCTFKNSNP 
CDFSAFGVFGCQYFSSGKYYWEVDVSGKIAWILGVHSKISSIiNK 
RKS SGFAFDPSVNYS KVYS R YRPQ YGYWVIGLQNTCE YNAFEDS 
SSSDPKVLTLFMAV\LPWLGFS 


6959 


1 


1469 


SLiVHWEFGRGIBDFPYIiFFQLTHOQQR I CS VTQAG VQWCDHSS 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LQPQTPGI^IQSSHLSLLSSRDYRMLSSFNEWFWQDRFWIjPPJNrVT 

VTTBIjEDRIXSRVYPHPQDIJAALPLALVLLAMRIiAFERFIG 

RWLGVRDQTRRQVKPMATLEKHFLTEGHRPKEPQLSLLAAQCGL 

TLQQTQRWFRRRRNQDRPQLTKKFCEASWRFLFYLSSFVGGLSV 

LYHESWLWAPVMCWDRYPNQLTLSCPAADSEA\SLYWWYLLBLG 

FYLSLLIRLPFDVKRKGGGPSSIKPRPHYDPPSTA\DFKEQVIH 

HFVAVILMTFSYSANLLRIGSLVIjLLHDSSDYLLEACKMVNYMQ 

YQQ VCEIALFLI FS FVFFYTRLVLFPTQI I»YTTYYES I SNRGPFF 

G YY FFNGIjIjMLIjQLLHVFWSCLI LRMIiYS FMKKGQMEKD IRS DV 

EESDSSEEAAAAQEPLQLKNGTAGGPRPAPTDGPRSRVAGRLTN 

RHTTAT 


6960 


387 


2068 


AKWAREKEMQEF\TRSFF\RGRPDLSTLTHSIVRRRYLAHSGRS 
HLEPEEKQALKRLVEEBPLKMQVDEAASRBDKLDLTKKGKRPPT 
PCSDPERKRFRFNSESESGSEASSPDYFGPPAKNGVASRSHTKP 
KE ENPRRA \ S KAVEE S S DEERQRDLPAQRGE ES S E E EEKG YKG K 
TRKKP WKKQAPGKAS VSRKQAREESEES EAEP VQRTAKKVEGN . 
KGTKSLKESEQESEEEILAQKKEQREEEVEEEEKEEDEEKGDWK 
PRTRSNGRRKSAREERSCKQKSQAKRLLGDSDSEEEQKEAASSG 
DDS GRDRE P P VQ RKS EDRTQLKGGKRLSGS S ED EE DSG KG E PTA 
KGSRKMARLGSTSGEESDLEREVSDSEAGGGPQGERKNRSSKKS 
SRKGRTRS S S SSSDGSPEAKGGKAGSGRRGE DHPAVMRLKRY IR 
ACGAHRNYKKLLGS CCS HKERLS ILRAELEALGMKGTPSLGKCR 
ALKEQREEAAEVASLDVANI ISGSGRPRRRTAWNPLGEAAPPGB 
LYRRTLDSDEERPRPAPPDWSHMRGX ISSDGESN 


6961 


340 


1646 


RPWSSPTMKPNFSLRLRIFNLNCWGIPYLSKHRADkMfefetGDFL 
NQESFDLALLEEVWSEQDFQYLRQKLSPTYPAAHHFRSGI IGSG 
LCVFS KHPI QELTQHI YTLNGYPYMI HHGD W FSGKAVGLLVtiHL 
S GMVLNAYVTHLHAE YNR Q KDI YLAHRVAQAWELAQ F IKHTS KK 
ADWLLCGDLNMHPEDLGCCLLKEWTGIjHDAYLETRDFKGSEEG 
NTMVP KNCYVSQQELKPFPFGVR I D YVLYKAVSGFYI SCKS FBT 
TTGFDPHRGTPLSDHEALMATLFVRHSPPQQNPSSTHGP\AERS 
PL/MCVCLKEALDGSLGLGMA\QARWWA\TFA\SYVIGLGL\IiL 
LALLCVLAAGGGAGEAAI LLWTPSVGLVLWAGAFYLFHVQEVNG 
LYRAQAELQHVLGRAREAQDLGPEPQLYALL\LGQQEGDRTKBQ 


6962 


340 


1646 


RPWSSPTMKPNFSLRIiRIFNLNCWGIPYLSKHRADRMRRLGDFL 
NQES FDLALLEE VWSEQDFQ YLRQKLS PT YPAAHHFRS G I IGSG 
LCVFSKHPIQEIiTQHIYTLNGYPYMIHHGDWFSGKAVGLLVLHL 
SGMVLNAYVTHLHAE YNRQKDI YLAH RVAQAVJELAQ F I HHT S KK 
AD WLLCGDLNMHP EDLGCCLLKEWTGLHDAYLETRDFKGSEEG 
NTMVPKNCYVSQQEIiKPFPFGVRIDYVLYKAVSGFYISCKSFET 
TTGFDPHRGTPLS DHEALMATLFVRHS P PQQN PS STHGP \AERS 
PL/ MCVCLKEALDGSLGLGMA\QARWWA\TFA\S YVIGLG L\ LL 
LALLCVLAAGGGAGEAAILLWTPSVGLVLWAGAFYLFHVQEVNG 
LYRAQ AELQHVLGRAREAQDLG PE PQL YALL\ LG QQEGDRTKEQ 


6963 


374 


2618 


RATTPIi IUCLLKKP KTABNQKASEENE ITQPGGSSAKPGLPCLNF 
EAVLSPDPALIHSTHSLTNSHAHTGS SDCDIS CKGMTERIHS IN 
l^NFSNSVI^TI^EQI^GHFCDVTVRIHGSMIJU^RCVLAAGS 
PFFQDKLLLGYSDIEIPSWSVQSVQKLIDFMYSGVLRVSQSEA 
LQ I LTAAS ILQI KTV I DECTRI VS QNVGDVFPG IQDSGQDTPRG 
TPESGTSGQSSDTESGYLQSHPQHSVDRIYSALYACSMQNGSGE 
RSFYSGAWSHHETALGLPRDHHMEDPSWITRIHERSQQMERYL 
STTPETTHCRKQPRPVRIQTLVGNIHIKQEMEDDYDYYGQQRVQ 
ILERNES EECTKDl'DQ AEGTES EPKGES FDSGVS S S IGTEPDS V 
EQQFGPGAARDSQAEPTQPEQAABAPAEGGPQTNQLETGASSPE 
RSNE VEMDS TVI TVSNSSDKS VLQQPS VNTSIGQ PL PS TQLYLR 
QTETLTSNLRMPLTLTSNTQVIGTAGNTYLPALFTTQPAGSGPK 








PPLFSLPQPLAGQQTQFVTVSQPGLSTFTAQLPAPQPLASSAGH 
STASGQGEKKP YECTLCNKT FTAKQNYVKHMFVHTGEKPHQ CS I 
CWRS FSLKDYL I K\HMVTHTGVRAYQCS ICNKRFTQKSSLNVHM 
RLHRGEKS YBC Y I C KKKFSHKTLLERHVALHSASNGTP PAGTPP 
GARAGPPGWACTEGTTYVCS VCPAKFDQI EQFNDHMRMHVSDG 


6964 


1 


178


SGRP FFFFFSNTDVYF I KKVTNRWTAGSSYKMTRMKS IGKI LLL 
QIFIG\NCSMFVLVI 


6965


757 


208 


NVFI EPR IQGFM KTS AHPGQKH PDFSMGLLFPLLAAIJ5 VCS CGS 
SGSIiGYNLPQNH\GLLGRNTIiVIiLGQMRRISPFLCLKDRSDFRF 
PQEKVEVSQLQKA\QAMSFLYDVLQQVFNFSHKALL\CCMEHDL 
PGPTPHFTSSAAGTPGDLLGAGDGRRRSWGQWVXEGSTLALRRY 
FOES IS TLB 


6966 


820 


1867 


IITALGVRGMPGCPCPGCGMAGPRLLFLTALALELLGRAGGSQP 
ALRSRGTATACRLDNKESESWGALLSGERLDTWICSLLGSIiMVG 
LSGVFPLLVI PLEMGTMLRS EAGAWRIiKQLLS FALGGLLGNVFL 
HLLPEAWAYTCSASPGGEGQSLQQQQQLGLWVIAGILTFLALEK 
/HVPGQQGGGDQPGPQQRPHCCCRRAQWRPLSGPAGCRARPRCR 
GP\D1KVSGYLNLLANTIDNFTHGLAVAASFLVSKKIGLLTTMA 
ILLIIE I PHEVGDFAILLRAGFDnWSAAKLQLSTAliGGLLGAGFA 
ICTQSPKGVBETAAWVLPFTSGGFI,YIAIjVNVLPDI»LEEEDPW 


6967 


162 


633 


GFLPFKYWILDLSASSRMETDCNPMELSSMSGFBEGSEIjNGFEG 

tdmkdmrleaeawndvlfavnnm fvs kslrcaddvayi nvetk 
erijryclelteaglkwgyafdqvddhlqtpyhetvyslldtl\ 
s payreafgkr \ llqrlealkrdgqs 


6968 


1 


2265 


RGGGGGRGGPGARERERPGEPERTMEAAAGGRGCFQPHPGLQK~ 
L E Q FHLSSMS S LGGP AAFSARWAQEAYKKESAKE AGAAAVPAP V 
PAATEPPPVLHLPAIQPP PPVL PGP FFMPSDRS TERCET VLEGE 
TI SCFWGGEKRLCIiPOI I*NS VLRD FSLQQINAVCDEIiH I YCSR 
CTADQLEILKVMGI LPFSAPSCGLI TKTDAERLCNALLiYGGAYP 
PPCKKBLAASLALGLELSERSVRVYHE\CFGKCXGIj\LVPELYS 

spsaacio^ld\crlmypphkfvvhshkalbnrtchwgf\dsa\ 
nwrayillsqdytgkeeqarlgr\clddvkekfdygnkykrrvp 
rvsseppasirpktddtssqspapsekdkpsswlrtlagssnks 
lgcvhprqrlsafrpwspavsasbkelsphlpalirdsfysyks 
fetavapnvaijvppao^kwssppcaaavsrapepiatctqprk 
rkltvdtpgapetlapvaapeedkdseaevevesreeftsslss 
lsspsftssssakdlgspgaralpsavpdaaapadapsgleael 
ehlrqaleggldtxeakekflhevvkmrvkqeeki^aalqakrs 

LHQE ZjEFLRVAKKEKLREATEAKRKLRKE IERXjRAENEKKMIQSA 

nesrlrlkreleqarqarvcdkgceagrlrakysaqiedlqvki. 
qhaeadreqlradllrerearehlek\wk\elqeqlwprarpe 
aagseg\aaelep 


6969


1855 


118 


AGTMHGRIiKVKTS E EOJ\EAKRLEREQKIjKL yqs atqavfqkrqa 

geldesvleltsqilganpdfatlkncrrevlqqletqkspeel 
aalvkaeixsflesclrvnpksygtvihhrcwllgrlpepnwtrel 

ELCARFLEVDERNFHCWDYRRFVATQAAVPPAEEIiAFTDSLITR 

nfsnysswhyrscllpqlhpqpdsgpqgrlpedvllkelelvqn 
afftdpndqsawfyhrwllgradpqdalrclhvsrdeacltvsf 
sr pllvgsrme i lllwvdds plivewrtpdgrnrpshvwlcdlp 
aaslndqlpqhtfrviwtagdvqkecvllkgrqegwcrdsttde 
qlfrcelsvekstvlqseiiesckelqelepenkwcl\ltiillm 
raldpllyeketlqyfqtlk\awdpkraty\lddlrskfllens 
vlkmeyaevrvlhi^kdltvi^leqll^vthldlshnrlrtl 
p palaalrcledppprt\ vlqasdnai esldg vtnlprlqelll 
cnkrlqqpavlq pliascprlvllnlqgnplcqavg ileqlaell 
psvssvlt 


6970 


3 


1528 


SFPPIiI^SPSAVGBGiCVAVAAPCI>GRSE(MAKMAyiQLBPrJ!7E 
GFLSRISGLLLCRWTCRHCCQKCYESSCCQSSEDEVEILGPFPA 
QT P P WLMAS RSSDKDGDS VHTASE V PLTPRTNS PDGRRSSS DTS 
KSTYSLTRRISSLBSRRPSSPLIDIKPIEFGVLSAKKEPIQPSV 
LRRTYNPDDYFRKFEPHLYSLDSWSDDVDSLTDEEILSKYQLGM 
LHFS TQYDLLHNHLTVRVI EARDLPPPISHDGSRQDMAHSNPY V 
KICLLPDQKNSKQTGVKRKTQKPVFEERYTFEIPFLEAQRRTLL 
LTWDFDKFSRHCVIGKVSVPLCEVDLVKGGHWWKALIPSSQNE 
VELGELLLSLNYLPSAGRLNVDVIRAKQLLQTDVSQGSDPFVKI 
QLVHGLKLVKTKKTSFLRGTIDPFYNESFSFKVPQEELEHASLV 
FTVFGHNMKSSNDFIGRIVIG\QYSSGP\SEPNHWRRMLNTHRT 
AVEQWHSLRSRAECDRVSPASLEVT 


6971


37 


3702 


ACF YVPGSRS FKIil PRHGLVNMGRSG KLPSG VS AKLKR WKKGHS 
SDSNPAI CRHRQAARSR FFSRPSGRSDLTVDAVKLHNBLQSGSIj 
RLGKSEAPETPMEEEAELVLTEKSSGTFLSGLS DCTNVTFSKVQ 
RFWESNSAAHKEICAVLAAVTEVIRSC3GGKETETEYFAALIRKA 
AQHGVCSVLKGSEFMFEKAPAHHPAAISTAKFCIQEIEKSGGSK 
EATTTLHMLTLLKDLL P CFPEGLVKS CS ETLLRVMTLS HVLVTA 
CAMQAFHS LFHARPGLSTLSAELNAQ 1 1 TALYD YVPS ENDLQPL 
LAWLKVMEKAHIMLVRLQWDLGLGHLPRFFGTAVTCLLSPHSQV 
LTAATQSIjKEILKECVAPHMADIGSVTSSASGPAQSVAKMFRAV 
EEGLTYKFHAAWSSVLQLtCVFFEACGRQAHPVMRKCLQSLCDL 

RLS phfphtaaldqavgaavtsmgpewlqavplei dgsbetld 

FPRS WIjLPVI RDHVQETRLG FFTTY FL PLANT L»KS KAMDLAQAG 
STVESKI YDTLQWQMWTLLPGFCTRPTDVAI S FKGLARTLGMAI 
SERPDLR VTAf COJULRTL ITKGCQAEADRAEVSRFAKNFLP I LFN 
LYGQ PVAAGDTPAPRRAVLETIRTYLTITDTQLVNSLLBKAS EK 
VLDPASSDFTRLSVLDLWALAPCADEAAISKLYSTIRPYLESK 
AHGVQKKAYRVLEEVCAS PQGPGALFVQSHLEDLKKTLLDSLRS 
rSSPAKRPRLKCLLHIVRKLSABHKEFITALIPEVILCTKEVSV 
GARKNAFALLVEMGHAFLRFGSNQEEALQCYLVLIYPGLVGAVT 
MVSCSILALTHUjFEFKGLMGTSTVEQLLENVCLLLASRTRDW 
ICS ALG F I KVAVTVMDVAHIiAKHVQLVMEAIGKLSDDMRRHFRMK 
LRNLFT\KFIPK\FGILTWGKKAVGPKEYHRVLVNIRKABARAK 
RHRALSQAAVEEEEEEEEEEEPAQGKGDSIBBILADSEDBEDNE 

eeersrgkeqrklarqrs rawlkegggde plnfld pkvaqrvla 
tqpgpgrgrkkdhsfkvsadgrliireeadgnkmeeeegakged 
eemadpmedviirnkkhqklkhqkeaeeeeleippqyqaggsgi 
hrpvakkampgabykakkakgdvkkkgrpdpyayiplnrsklnr 

RKKMKLO^QFKGLVKAAQRGSQVGHKNRRKDRRP 


6972 


2179 


973 


PGGAI llplwrrtrpreatvprgaaqrgrarsaegri PSSQS PS 
PAEAGGATRS PP PRP PR PARP PGPS APPLLRSDAG PGATVS AAA 
AAATERARRGATMGAQLSTLGHMVLFPVWFLYSLLMKLFQRSTP 
AITX/ESPDIKYPLRIjIDREIISHDTRRFRFALPSPQHILGLPVG 
QH I YLSARIDGNLWRPYTPI SSDDDKGFVDLVIKVYFKDTHPK 
FPAGGKMSQYLBSMQIGDTIEFRGPSGLLVYQGKGKFAIRPDKK 
S N P 1 1 RTVKSVGM I AGGTG ITPMLQVI RAIMKDPDDHTVCHLL F 
ANQTEKDILLRPELEELRNKHSARFKLWYTLDRAPEAWDYGQG\ 
FVNEEMIRDHLPPPE\BEPLVLMCGPPPMIQYACLPNL\DHVGH 
PTERCFVF 


6973 


1 


1964 


LQPRCAHRGI^QKCGRPAPGVIWIVI^PVIGKLLHK^VVLASA 
S PRRQEILSNAGLRFE WPS KFKEKLDKASFATP YG YAMBTAKQ 
KALEVANRLYQKDIiRAPDVVIGADTIVTVGGLILEKPVDKQDAY 
RMLSRFE/SGREHSVFTGVAIVHCSSKDHQLDTRVSBFYEETKV 
KFSELSEELLWEYVHSGEPMDKAGGYGIQALGGMLVESVHGDFL 
NWGFPLNHFCKQLVKLYYPPRPEDLRRSVKHDSIPAADTFBDL 








SDVEGGGSEPTORDAGSRDEKAEAGEAGQArAEAECHRTRETLP 
PFPTRLLBLIEGFMLSKGLLTACKLKVFDLLKDEAPQKAADIAS 
KVDASACGM ERLLD I CAAMGLLEKTEQG YSNTETANVYLASDG E 
YSLHGFIMHNNDLTWNLFTYLE FAIREGTNQHHRALG KKAEDLF 
QDAY YQS P ETRXRFMRAMHGMTKLTACQVATAFNLSR FS S ACD V 
GGCTGALARELARE YPRMQVTVFDLPD I 1 ELAAHFQ PPG PQAVQ 
IHFAAGDFFRDPLPSAELYVLCRILHDWPDDKVHKLLSRVAESC 
KPGAGLLLVETLLDEEKRVAQRALMQSLNMLVQTEGKERSLGEY 
QCLLELHG FHQ VQ WHIiGGVLDAI L\ PPKWPPEAQAACSL 


6974


3082 


2172 


RSC^FASFASRPPliELFAPPGSHRSPPGRGVATSAQCALSVRlT" 
LLAARPGLGTKYQATMVYKTLFALCILTAGWRVQSliPTSAPLSV 
S LPTNXVP PTTI WTS S PQNTDADTAS PSNGTHNNS VL PVTAS A P 
TSLLPKNIS I ESREE EITSPQSNWE3TNTDPSPSGFSSTSGGVH 
IjTTTLEEHSIiGTPEAGVAATLSQSAAEPPTLISPOAPASSPSSL 
STSPPEVFSASVTTNHSSTVTSTQPTGAPTAPESPTEESSSDHT 
PTSHATAEPVPQEKTPPTT vsgkvmcelidmet\ pp p fpg 


6975


2 


500 


RPRPT^CCiCWALKL^TA^ETLIN^KAHSGKEGDKYKLSKKBL- 

kei*lqtelsgfldvkelml*ateai>ktfeea* kspi iqcsssrs 

SLPPAPQPPP YI,* LSAVPFP IHLPLPLLPPQAQKDVDAVDKVMK 
BLDEMGDGEVDFQE Y WLVAALTVACNNFFWENS 


6976 


1216 


970 


GCQL* YAYGTTENS P VTFAHFPE DTVEQKAE S VGR IMPHTEAR I 
MNMEAGTLAKLNTPGELCIRGYCVMLGYWGEPQKTEEAVDQDKW 
YWTGD VATMNEQG FCKI VGRSKDMI IRGGENI YPAELEDFFHTH 
PKVQEVGWG VKDDRMGEE I CAC I RLKtJGEETTVTrTr t v h. prifrv 

ISHFKIPKYIVFVTNYPLTISGKIQKFKLREQMERHLNL*IKQQ 
ACPGRLA 


6977


1298 


588 


SLFINTNLIjSNQIRKTSFGMCSEPISDNTKDQKGKIjKTPDFA*R 
ANKKS KHHVNGNRTVEPFPEGTQMAVFGMGCFWGAERKFWVLKG 
VY5TQVGFAGGYTSNPTYKEVCSEKTGHABWRWYQPEHMSFE 
ELLKVFWENHDPTQGMRQGNDHGTQYRSAIYPTSAKQMEAALSS 
KENYQKVLSEHGFGPITTDIREGQTFYYAEDYHQQYLSKNPNGY 
CGLGGTGVS CPVGIKK 


6978


3 


242 


S FP FRDSRRCG CCKGSS LRHTAVAM VKLS KEAKQRIjQOLFKGS Q 
FAIRWGF IPLVI YLGFKRGADPGMPEPTVLSLIiWG 


6979 


3917 


1146 


DEAR VRGEAVAAAI LSRCRHWSGPPPFPPSPPDRKGLRG TEP WE 








RLQAALEAEEPDDERELDADDEPGRPGHINEEVETEGGSELEGT 
AQPPP PGLQ PHAEPGG YS GPDGH YAMDNITRQNQFYDTQVI KQE 
NESGYERRPLEMEQQQAYRPEMKTEMKQGAPTS FLPPE ASQLKP 
DRQQ FQS RKR P YE ENRGRG YFEHREDRRGRS PQPPAEEDEDDFD 
DTLVAIDTYNCDLHFKVARDRSSGYPLTIEGFAYLWSGARASYG 
VRRGRVCFEMKINEE ISVKHLPSTEPDPHWR IGWSLDSCSTQL 
GEEPFSYGYGGTGKKSTNSRFENYGDKFAENDVIGCFADFECGN 
DVELS FTKNGKWMGIAFR IQKEALGGOALYPHVLVKNCAVEFNF 

H^D Z) t D VPOl/T DPIPTDTrtuT r*T fionTnnmmni/ovtt ns+n^* itm - 

uyM/U5«* XUoVJjJrUtf 1 r JtQliligliS flKJL RGTVGPKS KAECB I LMMV 
GLPAAGKTT WAI KHAASNPS KKYNILGTNAIMDKMRVMGLRRQR 
N YAGR WD VLI QQATQCLWRL IQIAARKKRNYI LDQTNVYGSAQR 
RKMRP FEGFQRKAIVICPTDEDLKDRTI KRTDEEGKDVPDHAVL 
EMKANFTIiPDVGDFLDEVLFIELQREEADKLVRQYNEEGRKAGP 
PPE KRFDNRGGGGFRGRGGGGGFQR YENRGP PGGNRGGFQNRGG 
GSGGGGNYRGGFNRSGGGGYSQNRWGNNNRDNNNSNNRGSYNRA 
PQQQPPPQQPPPPQPPPQQPPPPPSYSPARNPPGASTYNKNSNI 
PGSSANTSTPTVSSYSPPQSFGFFPSTFQPSYSQPPYNQGGYSQ 
G YTAP P P P P PP P PAYNYGS YGGYNPAP YT PPP PPTAQTY PQP S Y 
NQYQQYAQQWNQYYQNQGQWPPYYGNYDYGSYSGNTQGGTSTQ 


6980


1 


420 


GTRGRKTGRVAAPSTRRRTGNMOKLQTRS PAMSLS DPGLGYHPT 








CWTLRWPPIiCSLhVUiHVFHCLFSSRLGTPVSPRLAMDPKfegeBA 
OGSCACAGSCKCKKCKCTSCKKSCCSCCPLGCAKCAQGCICKGA 
SEKCSCCA 


6981


10 


1054 


PGRGPRRASLRPAFAARGVFQGGLGQAKQARTRACAAIjPTPHPS 
APRLLEPQGVFSLFPPPPGPWPNMILTKAQYDEIAQCIjVSVPPT 
RQSLRKLKQRFPSQSQATLLS I FSQEYQKHIKRTHAKHHTSBAI 
E S YYQRYLNG VVKNGAAPVLLDIjANEVD YAPSLMARL I LERFLQ 
EHEETPPSKS I INSMIiRDPSQ I PDGVLANQVYQC I VNDCCYGPL 
VDCI KHAIGHEHEVLLRDLLLEKNLSFLDEDQLRAKG YDKTPDF 
ILQVPVAVEGHIIHW I ESKASFGDECSHHAYLHDQFWS YWNRFG 
PGLVIYWYGFIQBLDCNRERGILLKACFPTNIVTLCHSIA 


6982 


153 


128* 


FPQQDCSAPAAPGLAGSEPRRLRAYRRRRQRARGLKRVAWLAP P 
PSLLG^IiQ^WAO^PVDGTIiGPEDSRASSPMIQNSRPSLLQPQDV 
GDTVETLMLHPVIKAFLCGSISGTCSTLLFQPI^LLKTRLOTLQ 
PSDHGSRRVGMLAVLLKVVRTESLLGLWKGMSPSIVRCVPGVGI 
YFGTLYSLKQYFLRGHPPTALESVMLGVGSRSVAGVCMSP1TVI 
KTR YESG KYG YES 1 YAALRS I YHS EGHRGLFSGLTATLL RDAP F 
SGI YLM FYNQT KN I VPHDCVDATL I P 1TNFS CG I FAG 1 LASL VT 
QP ADVI KTHMQLYPLKFQWIGQAVTLI FKDYGLRGFFQGG I PRA 
LRRTLMAAMAWTVYEEMMAKMGLKS 


6983 


62 


773 


EMSFLQDPSFFTMGMWSIGAGALGAAALALLIiANTDVFLSKPQK 
AALEYIiEDIDIiKTLEKEPRTPKAKELWEKNGAVTMAVRRPGCFL 
CREEAADLSSLKSMLDQLGVPLYAWKEHIRTEVKDFQPYFKGE 
IFLDEKKKFYGPQRRKMMFMGFIRLGVWYNFFRAWNGGFSGNLE 
G EG F I LGG VF WG SGKQG I LL EHREKE FGDKVNLLS VLEAAKM I 
KPQTLASEKK 


6984 


1B4S -"" 


1282 


GGRSAYSLPAGSLPRVPATAAAKMASGVQVADEVCRIFYDMKVR 
KCSTPEEIKKRKKAVIFCLSADK JCCI I VEEGKE I LVGDVGVTI T 
DPFKHFVGMIiPEKDCRYALYDASFKTKESRKEELMFFLWAPELA 
PLKSKMIYASSI03AIKKKFQGIKHECQANGPEDIjNRACIAEKIiG 
GSLIVAFEGCPV 


6985 


1887 


1324 


rrtagiypcTpkpgr-trhalgsvvlllltgqlafddfqescamm 

WQKYAGSRRSMPLGARILFHGVFYAGGFAIVYYLIQKFHSRALY 
YKLAVEQLQSHPEAQEALGPPLNIHYLKLIDRENFVDIVDAKLK 
IPVSGSKSEGLLYVHSSRGGPFQRWHIiDEVFLELKDGQQIPVFK 
LSGENGDEVKKE 


6986


642 


1350 


YHLYFKMGDPNS RKKQALNRLRAQLRKKKE^LaDQFDFKMY IAF 
VFKSKKKKSAL F EVS E VI PVMTNNYEEN I LKGVRDS SYSLESSL 
ELLQKDWQLHAPRYQSMRRDVIGCTQEMDFILWPRNDIEKIVC 
LLFSRW KESDE P FRP VQAXFE FHHGD YEKQFLHVLSRKDKTG I V 
VNNPNQS VFLF I DRQH LQT P KNKATI FKLCS ICLYLPQEQLTHW 
AVGTI EDHIiR PYMPE 


6987 


1623 


341 


IiEAAEKASRAFKESQRQTDS KNYETENWSPQKSQRRYDMYNTAC 
FLGE I EVGLYT I Q ILQLTPFFHKENELSKKHMVQFLSGKWTI P P 
DPRNBCYLALSKFTSHLKNLQSDLKRCFDFFILYMVLLKMRYTQ 
KEIABIMLSKiCVSRCFRKYTELFCHLDPCIiLQSKESQIiLQEENC 
RKKLEALRADRFAGLLE YLNPNYKDAT TM ES I VNE YAFLLQQNS 
KKPMTNE KQNS ILANI ILS CLKPNSKL IQPLTTLKKQIiREVLQ F 
VGLSHQYPGPYFLACZjLFWPENQELDQDSKLIEKYVSSLNRSFR 
GQYKRMCRSKQASTIjFYLGKRKGLNSIVHKAKIEQYFDKAQNTN 
SLWHSGDVWKKNEVKDLLRRLTGQAEGJCLISVEYGTEEKIKIPV 
ISVYSGPLRSGRNIERVSFYLGFSIEGPPGL 


6988 


3 


689 


TQLLRRPAVFVGSAASGIRSGLWSASSGHWCAPAAGRAHAPVPR 
L VRGLGAASTAA PQ DAQTG PO P M PRAD C I MRHLP YF CRGQ WRG 
FGRGSKQLGI PTANF PEQWDNLPAD I STG I YYGWAS VGSGDVH 
KMWSIGWNPYYKNTKKSMBTHIMHTFKEDFYGRILNVAIVGYL 








RPEKNFDSLESLISAIQGDIBEAKKRLELPEHLKIKEDNFFQVS '" 
KSKIMNGH 


6989 


2 


1118 


L^PSDRPLSPSTHASAGSHCHAPPTTARRAFPiPFGSKSNMA'l'L 
lujytii i w Liij K LbUT PQN id TWG VGAVGMACA I S I LM KDLADE L 
ALVDVIEDKLKGEMMDLQHGS LPLRTPKI VSGKDYNVTANSKLV 
I ITAGARQQEGESRLNLVQRNVNIFKPIIPNWKySPNCKLLIV 
SNPVDILTWAWKISGFPKNRVIGSGCNLDSARFRYLMGBRLGV 
HPLSCHGWVLGEHGDSSVPVWS GMWAGVSLKTLHPDLGTDKDK 
EQWKEVHKQVVESAYEVIKLKGYTSWAIGLSVADLAESIMKNLR 
RVHP VSTM I KGL YG I KDD VFLS VPCI LGQNGI SDLVJCVTLTS EE 

T7&DT VlfOJlTVPT r.T/-i t r\ \rT> T rim 
O, AKJjKKiiAJJ 1 L»WCjIQKEIj(QF 


6990 


719 


258 


THASGMAS WLALRTRTAVTS LLS PTPATALAVRYAS KKSGGS S 
KNLGGKS SGRRQGI KKMEGHYVHAGNI I ATQRHFRWHPGAHVGV 
GKNKCLYALEEGIVRYTKEVYVPHPRNTEAVDLITRLPKGAVLY 
KTFVHVVPAKPEGTFKLVAML 


6991 


169 


451 


RRSS DFHNPGFLSR P VS LREN I liHQVI CSTKNKRRN PKK!£aT?EL 
S S LLMTNLN PNES TENQP VDAYWAFTLDQE FI/TYACVEGTGCLF 
CGRHVH 


6992


944 


510 


RQAPGCSSLALRQVRQVYCGLVRAPQVQTRPLSSRFVERRGALY " 
RS PMNQEN P PPYPGPGPTAPYP P YP PQPMGPGPMGGP YPP PQG Y 
PYQGYPQYGWQGGPQEPPKTTVYVVEDQRRDELGPSTCLTACWT 
ALCCCCLWDMLT 


6993 


1 


374 


QWCVTCPQHNARQGPAVPPGiOAYGAAPFEDtQVDFTEMSkCRG 
DRVWIKNWOTASLCPLWKGPQTVVLSPPTAVKVEGI PAWIHHSH 
VKPAARET WEARPS PDNP FRVTLKKTTSPAP VTPGS 


6994


346 


1100 


QWPEKDPVMAASSISSPWGKHVFKAILMVLVALILLHSAIiAQSR 
RDFAPPGQQKREAPVDVIiTQIGRSVRGTLDAWIGPETMHLVSES 
S5 QVLWAI SS AI SVAFFALSG I AAQLLMALGLAGD YLAQGLKLS 
PGQVQTFLLWGAGALWYWLLSLLLGLVLALLGRILWGLKLVIF 
LAGFVALMR3VPDPSTRALTtTiTiATJiILYALLSRLTGSRASGAQL 
EAKVRGLERQVEELRWRQRRAAKGARSVEEE 


6995


144 


1346 


GS VAVGLSG I MAAQ KDL WDAI V IGAG IQG CFTAYHLAKHRKRI L 
LLEQ FFLPHSRGSSHGQSR XIRXAYLEDF YTRMMHECYQI WAQL 
EHEAGTQLHRQTGLLLLGMKENQELKTIQANLSRQRVEHQCLSS 
EELKQRFPN IRIiPRGEVGLLDNSGG VI YAYKALRALQDAI RQLG 
G I VRMEKWE INPGLLVTVKTTSRS YQAKSLVITAGP WTNQLL 
RPLGI EMPLQTLRIN VCYWREMVPGS YGVSQAFPCFLWLGLCPH 
H I YGLPTGE YPGLMKVS YHHGNHADPEERDCPTARTDIGDVQI L 
SS FVRDHLPDLKPEPAVI ESCMYTNTPDBQFILDRHPKYDNIVI 

GAGFSGHGFKLAPWGKILYELSMKLTPSYDLAPFRISRFPSLG 
KAHIt 


6996


543 


1942 


ETANAEAAARKSAMDWKEVLRRRIjATPNTCPNKKKSEQELKDEE " 

MDLFTKYYSEWKGGRKNTNEFYKTIPRFYYRLPAENEVLLQKLR 

EESRAVFLQRKSRELLDNEELQNLWFLLDKHQTPPMIGEEAMIN 

YENFLKVGBKAGAKCKQFFTAKVFAKLLHTOSYGRISIMQFFNY 

VMRKVWLHQTR IGLS LYD VAGQGYLRES DLENY I LELI PTLPQL 

DGLEKSFYSFYVCTAVRKFFFFLDPLRTGKIKIQDILACSFLDD 

LLELRDEELS KESQETNWFS APSALR VYGQYLNLDKDHNGMLS K 

EEIiS R YGTATMTNVFIJ)RVFQECLTYTOEMD YKTYIjDFVIiALEN 

RKEPAALQYIFKLLDIENKGYIiNVFSIiNYFFRAIQELMKIHGQD 

PVSFQDVKDEI FDMVKPKDPLKISLQDLINSNQGDTVTT1LIDL 

NGFWTYENRKALVANDSENSADLDDT 


6997 


370 


1104 


AMBLTIFILRLAIYILTFPIiYIiLNFIiGIjWSWICKKWFPYFLVRF " 

TVIYNEQMASKKRELFSNLQEFAGPSGKLSLLEVGCGTGANFKF 

YPPGCRVTCIDPNPNFBKFLIKSIAENRHWJFERFVVAAGENMH 








QVADGSVDVWCTLVLCSVKNQERILREVCRVLRPGGAFYFMBH 
VAAE CSTWNYFWQQ VLD P AWHLXi FDG CNLTRES WKALE RAS FS K 
LKLQHIQAPLSWELVRPHIYGYAVK 


6998


2 


616


FVSRALLRVRSRRHPAEERAAPGRPEDAPIECPGATNCPEPLWC 
SHLP VP YAP PTMES RGKS AS S PKPDT KVPQVTTEAKVP PAADGK 
APLTKPSKKEAPAEKQQP PAAPTTAP AKKTS AKAD P ALLNNHSN 
LKPAPTVPSSPDATPEPXGPGDGAEEDEAASGGPGGRGPWSCEN 
FNPLLVAGGVAVAAIALILGVAFXiVRKK 


6999 


14 


1591 


GRAGACSRRDTAMSIEIESSDVIRLIMQYLKENSLHRAIiATLQE 
ETTVSLNTVDS I ESFVADINSGHWDTVLQAIQSLKLPDKTLIOL 
YEQ WLE LI ELRELGAARS LLRQTDPM IMLKQTQPER Y IHLENL 
LARSYFDPREAYPDGSSKEKRRAAIAQAIAGBVSWPPSRLMAL 
LGQALKWQQHQGLLPPGMTIDIiFRGKAAVKDVEEEKFPTQLSRH 
IKPGQKSflVECARFSPDGQYLVTOSVDGFIEVWNFTTGKIRKDL 
KYQAQDNFMMMDDAVLCMCFSRDTEMLATGAQDGKI KVWKI QSG 
QCLRRFERAHSKGVTCIiSFSKDSSQILSASFDQTIRIHGLKSGK 
TLKE FRGHS S F VNEATFTQDGH Y 1 1 S AS S DGTVKI WNMKTTECS 
OTFKSLGSTAGTDITVNSVILLPKNPEHFWCNRSNTVVIMNMQ 
GQIVRSFSSGKREGGDFVCCALSPRGEWIYCVGEDFVLYCF6TV 
TGKLERTLTVHEKDVIGIAHHPHQNLIATYSEDGLLKLWRP 


7000 


2 


827 


GPGWFLELMESEGPPESERSEFFSQREEENEEEEAQEPEETGP 
KNPLLQPALTGDVEGLQKI FEDPENPHHEQAMQLLLEEDI VGRN 
LLYAACMAGQSDVI RALAKYGVNLNEKTTRGYTLLHCAAAWGRL 
ETLKALVELDVD I EALNFRE ERARDVAAR YSQT E CVEFLDWADA 
RLTLKKYIAKVSLAVTDTEKGSGKLLKEDKNTILSACRAKNEWL 
ETHTEAS INELFEQRQQLEDIVTPI FTKMTTPCQVKSAKSVTSH 
DQKRSQDDTSN 


7001 


2056 


844 


RRCIiIIAFLKGCFIFIYFIFIFETEFLSCCPGWSAVAQSRLIAN 
FASQVQAIFILPKDSQVGPDVKSEAAPKRAliYESVFGSGEICGP 
TSPKRLCIRPSEPVDAVWVSVKHDPLPLLPEANGHRSTNSPTI 
VSPAIVSPTQDSRPNMSRPIjITRSPASPLNNQGIPTPAQLTKSN 
APVHIDVGGHMYTSSLATLTKYPESRIGRLFDGTEPIVLDSLKQ 
HYFIDRDGQMFRYILNFLRTSKLLIPDDFKDYTLLYEEAKYFQL 
QPMLLEMERWKQDRETGRFSRPCECLWRVAPDLGER1TLSGDK 
SLIEEVFPEIGDVMCNSVNAGWNHDSTHVIRFPLNGYCHLNSVQ 
VLERLQQRGFE I VGSCGGGVDSSQFSEYVIiRRELRRTPRVPSVI 
RIKQEPLD 


7002 


1043 


498


PMPSSTRWTTS*TYTDTSSAWACRPTTGTCT*TAAPGPTVRWWP 
TPCSRHQSRRRLTCWCSTSRPCGR*GGLCVRTAPTRPTTSASSS 
SWTS ACTS WPAGRRTGTATSGTATTTSVWPGCGTRMWSTQW S SV 
PRSRSCCSRPATTPPSKPGAPHAPCASSRHLAKGLAPSSPGLPA 
RGAEVC 


7003 


818 


61 


QGRFRAFCWQRDFLQPPGMRLSAIiLALASKVTLPPHYRYGMSPP 
GS VADKRKNPP W I RRRPVWEP X S DE D WYLFCGDTVE I LEGKDA 
GKQGKWQVIRQRNV^VVVGGLNTHYRYIGKTMDYRGTMIPSEAP 
LLHRQVXLVDPMDRKPTEI EWRFTEAGERVRVSTRSGRI IPKPE 
FPRAIXvI^BTWIDGPKDTSVEDALERTYVPCLKTLQEEVMEAM 
G I KETR \NTRRS IG I E PGAEQLLPNFCPS l*EG 


7004 


121 


2285 


FLLPV^TSRSLRQPAVPHARLGGVEPAAMKSARAKTPRKPTVKK 
G\PKRTLKTQLG/YYCRVRPLGFPDQECCIEVINNTTVQLHTPE 
GYRIjNRNGDYKETQYSFKQVFGTHTTQKELFDWANPLVNDLIH 
GKNGLLFTYGVTGSGKTHTMTGSPGEGGLLPRCLDMIFNSIGSF 
QAKRYVFKSNDR1ISKDIQCBVDALLERQKREAMPNPKTSSSKRQ 
VDPEFADMITVQBFCKAEEVDEDSVYGVFVSYIBIYNNYIYDLL 
EEVPFDPINPNLHNIiMCFVKI KNHNMYVAGCTEVEVKSTEEAFE 
VFWRGQKKRRIANTHLNRESSRSHSVFNIKLVQAPLDADGDNVL 








QEKEQITISQLSLVDLAGSERTNRTRAEGNRLREAGNINQSLMT 
LRTCMDVLRENQMYGTNKMVP YRDS KLTHL FKN YFDGEGKVRM I 
VCVNPKAEDYEENI^VTTOPAEVTQEVEVARPVDKAICGLTPGRR 
YRNQPRGP \ IGNE PLVTD WLQS FPPLPSCE I LDINDEQTLPRL 
IEALEKRHNLRQMMIDEFNKQSNAFKALLQEPDNAVLSKENHMQ 
GKLNEKEKMISGQKLEIERLEKKNKTIjEYKIEIIiEKTTTIYEBI) 
KRNLQQELETQNQKLQRQFSDKRRLEARLQGMVTETTMKWEKEC 
ERRVAAKQLEMQNKLWVKDEKLKQLKAI VTE PKTE KPBRP SRBR 
DREKVTQRSVSPSPVPVSYL 


7005


63 


876 


rnmalyqrwrclrlqglqacrlhtawstpprWlaerLglfeel 

W AAQVKRLASMAQKE PRT I KI S L PGGQK I DAVAWNTT PYQLARQ 
I S STLADTAVAAQVNGEPYDLERPLETDSDLRFLTFDS PEGKAV 
FWHSSTHVI/3AAABQFIiGAVLCRGPSTEYGFYHDFFLGKERTIR 
GSELPVLERICQELTAAARPFRRLEASRDQIjRQLFKDNPFKLHIi 
IEEKVTGPTATVYGCGTLVDLCQGPHLRHTGOIGGLKLLSNSSS 
LWRSSG 


7006


22 


898 


NAFGRKSTAVKMAAAAMIjQVLpVI LLLLGA¥1pSPLS ffsa&pat' ^ 

VAAADRSKWHIP1PSGKNYFSFGKILFRNTTIFLKFDGEPCDLS 

LNITWYLKSADCYNE I YNPKABE VELYLE KLKEKRGLSG KYQTS 

S KLPQNCSELFKTQTFSGD FMHRLPLLGE KQEAKENGTNLTFIG 

DKTAMHEPLQTWQDAPYIFIVHIGISSSKESSKENSLSNLFTMT 

VEVKGPYEYLTLEDYPLMIFFMVMCIVYVLFGVLWIAWSACYWR 

DLLRIQFWIGAVIFLGMLEKAVFYAGFQ 


7007


2 


1001 


AMTVSGPGTPEPRPATPGASSVEQLRKEGNELFKCGDY'GGAIiAA " " 
YTQALGLDATPQDQAVLHRNRAACHLKLEDYDKAETEAS kai e k 
DGGDVKALYTUISQALEKIjGRIJJQAVLDLQRCVSLEP knkvfqea 
IiRNIGGQIQEKVRYMSSTDAKVEQMFQILLDPEEKGTEKKQKAS 
QNIiWLAREDAGAEKI FRSNGVOLIjQRLLDMGETDLMLAALRTL 
VGICSEHQSRTVATLS ILGTRRWSILGVESQAVSIiAACHLLQV 
MFDALKEGVKKGFRGKEGAIIVGEWKQVWGLLDVTVMEGMGLSQ 
PGQFFGDQTCSCRLFGIRFGDI ill 


7008 


70 


i4?e 


CRSALGHERPPPAHLPAGGRRLQTCPRS CRWL&RFPSGLPPGPR " 
SPPPIAGPGQKMVQKKPAELQGFHRSFKGQNPFELAPSLDQPDH 
GDSDFGLQCSARPDMPASQPIDIPDAKKRGKKKKRGRATDSFSG 
RFEDVYQLQED VLGEGAHARVQTC INLI TSQE YAVKI 1 EKQPGH 
I RSRVFREVEMLYQCQGHRNVLEL I E F FEBE DRFYLVFE KMRGG 
SILSHIHKRRHFNELEASWVQDVASALDFLHNKGIAHRDLKPE 
N I LCEHPNQVS P VK I CD FDLGSG I XLKGDCS P ISTPELLTPOGS 
AE YM7VPEVVEAFS EEAS I YDKR CDLWSLG V I LYILLSG YPPFVG 
RCGSDCGWDRGEACPACQNMLFES I QEGKYE FPDKDWAHI SCAA 
KDLISKLLVRDaKjQRLSAAQVLQHPWQGCAPENTLPTPMVLQR 
WDSHFLLPPHPCRIHVRPGGLVRTVTVNE 


7009


1 


626 


ARQLRNSWVDDFVAAPLI PLSQQI ptgnslyes yykqvdpaytcT" 
RVGASEAALFLKKSGLSDX 1LGKI WDLADPEGKGFLDKQGFYVA 

WAVRVEEKAKFDGIFESLLPINGLLSGDKVKPVLMNSKLPLDVL 
GRVWDLSDIDKDGHLVRDEFAVAMHLVYRALE 


7010 


79 


571 


SHTRRAWPETLLS PLCPLLGGGTAMSGGEQKPERYYVGVDVGT 
GSVRAALVDQSGVLLAFADQPIKNWEPQFNHHEQSSEDIWAACC 
WTKKWQG I DLNQ I RGLG FDATCS LWLDKQ FHPL P VNQEGDS 
HRNVTMWLDHRAVS QVNRINETKHSVLQYVGG 


7011 


3 


994 


RIQTLPNQNQSQTQPLLKTPPAVLQPIAPQTTF6VQTQPQPQSL 
LQAQ ISAAS ITPLLQTQPQPLLQQ PQQKAGLLQPPVRI VSQPQP 
ARRIiDPPSRFSGRNDRGDQVPNRKDDRSRERERERRRSRERSPQ 
RKRSRERS PRRERERS PRRVRRWPRYTVQFSKFSLDCPSCDMM 
ELRRRYQNLYIPSDFFDAQFTWVDAFPLSRPFQLGNYCNFYVMH 








reveslekwmaildppdadhlVsakvmlmaspsmedlyhkscal 

AEDPQSLRDGFQHPARLVKFLVGMKGKDEAMAIGGHWSPSLDGP 
DPEKD PS VLI KT \AI RCCKALTG 


7012 


1 


2661 


RRAGSVKRGEAkLFGPTERQSE^PLRPSAARRPEMLSGKKAAAA 
AAAAAAAATGTEAGPGTAGGSENGSE VAAQPAGLSG PAEVGPGA 
VGERTPRKKEPPRAS PPGGLAB PPGSAGPQAGPTWPGSATPME 
TGIAETPEG\RRTSR R KRAKVE YRBMDESLANLSEDE YYSEEER 
NAKAEKEKKLPPPPPQAPPEEENESEPEEPSGVEGAAFQSRLPH 
DRMTSQEAACFPDI ISGPQQTQKVFLFIRNRTLQLWLDNPKIQL 
TFEATIiQQLEAPYNSDTVLVHRVHSYLERHGLINFGIYKRIKPL 
PTKKTGKVI I IGSGVSGLAAARQLQSFGMDVTLLEARDRVGGRV 
ATPRKGNYVADLGAMWTGLGGNPMAWSKQVNMELAKI KQKCP 
LYEANGQAVPKEKDEMVEQEPNRLLEATSYIiSHQLDPNVLNNKP 
VSLGQALBWIQI^EKHVKDEQIEHWKKIVKTQEELKELLNKMV 
NLKEKIKELHQQ YKEAS EVKP PRD ITAEFLVKSKHRDLTALCKB 
YDELAETQGKIiEEKLQELBANPPSDVYLSSRDRQILDWHFANLE 
FANATPLSTLS LKHWDQDDDFE FTGSHLTVRNGYSCVPVALAEG 
LDIJCLNTAVRQVRYTASGCEVIAVWTRSTSQTFrYKCDAVLCTIi 
PLGVLKQQP PAVQFVP PLP EWKTSAVQRMGFGNLNKWLCFDRV 
FWD PS VNLFGHVGS TTASRGE L F L FWNL YKAP I LLAL VAGEAAG 
IMENISDDVI VGRCLAILKGI PGS SAVPQPKETWSRWRADPWA 
ROSYS YVAAGSSGNDYDLMAQP ITPGPS I PGAPQ P I PRLFFAGB 
HT I RNYPATVHGALLSGLR EAGR IADQFLGAMYTLPRQATPGV P 
AQQSPSM 


7013 


1


2661 


RRAGSVKRGfciARLFGPTERQSERPLRPSAARRPEMLSGKKAAAA 
AAAAAAAATGTEAGPGTAGGSENGSEVAAQPAG LEG PAEVGPGA 
VGERTPRKKE PPRAS PPGGLAE PPGSAGPQAGPTWPGSATPME 
TG I AE TP EG \RRTSRR KRAKVE YREMDESLANLSEDEYYSEEER 
NAKAEKEKKLPPPPPOAPPEEENESEPEEPSGVEGAAFQSRLPH 
DRMTSQEAACFPDI ISGPQQTQKVFLFIRNRTLQLWLDNPKIQL 
TFEATLQQ LE AP YNSDTVLVHRVH S YLE RHG L I NFG I YKR I KPL 
PTKKTGKVI I IGSGVSGLAAARQLQS FGMDVTLLEARDRVGGRV 
ATFRKGNY VAD LGAMWTGLGGNPMAWS KQ VNMELAKI KQKCP 
LYEANGQAVP KEKDEMVEQE FNRLLEATSYLSHQLDFNVLNNKP 
VSIiGQALEWIQIiQEKHVKDEQIEHWKKIVKTQBELKELLNKMV 
NLKEKIKELHQQYKEASEVKPPRDITAEFLVKSKHRDLTALCKE 
YDEIJ^TQGKLEEKMBLEANPPSDVYLSSRDRQIIjDWHFANLE 
FANATPLSTLSLKHWDQDDDFEFTGSHLTVRNGYSCVPVALAEG 
LD I KLNTAVRQVRYTASGCB VI AVNTRS TSQTFI YKCDAVLCTL 
PLGVLKQQPPAVQFVPPLPEWKTSAVQRMGFGNLNKWLCFDRV 
FWD P S VNL FGHVGS TTAS RG ELFL FWN LYKAP I L LAL VAGEAAG 
IMENISDDVI VGRCLAILKGI FGSSAVPQPKBTWSRWRADPKA 
RGSYSYVAAGSSGNDYDLMAQPITPGPSIPGAPQPIPRLFFAGE 
HTIRNYPATVKGALL5GLREAGRIADQFLGAMYTLPRQATPGVP 
AQQSPSM 


7014 


3 


3950 


DFEVGDKIRILATLEDGWLEGSLKGRTGIFPYRFVKLCPDTRVE 
ETMALPQEGSLARI PETSLDCLENTLGVEEQRHETSDHEABEPD 
CI ISEAPTSPLGHLTSEYDTDRNSYQDEDTAGGPPRSPGVBWEM 
PLATDS PTS DPTE WNGISS QPQ VPFHPNLQKS Q YYSTVGGSHP 
HSEQYPDLLPLBARTRDYASLPPKRMYSQLKTLQKPVLPLYRGS 
S VSASR WKPRQSS PQLHNLAS YTKKHHTSS VYS I S ERLEMKPG 
PQAQGLVMEAATHS QGDGSTDLDS KLTQQLI EFEKSLAGPGTE P 
DKILRHFSIMDFNSEKDIVRGSSKLITEQELPERRKALRPPPPR 
PCTP VSTS PHLLVDQNLKPAPPLWRPS RPAPL P PS AQQRTNAV 
SPKLLSRHRPTCETLEKEGPGHMGRSLDQTSPCPLVLVRIEEME 
RDLDMYSRAQEELNLMLEEKQDES SRAETLEDLKFCESNIESLN * 








MELQQLREMTLLISSQSSSLVAPSGSVSAENPBQRMBEKRAKVIE
ELILQTERDYIRDLEMCIERIMVPMQQAQVPNIDFEGLFGNMQMV
IKVSKQLIIAALEISDAVGPVFLGHRDELEGTYKIYCQNHDEALA
LLEIYEKDEKIQKHLQDSLADLKSLYNEWGCTNYINLGSFXIKP
VQRVMRYPLLLMELLNSTPESHPDKVPLTNAVLIAVKELNVNXNE
YKRRKDLVLKYRKGDEDSLMEKISKLNIHSIIKKSNRVSSHLKH
LTGFAPQIKDEVFEETEKNFRMQERLIKSFIRDLSLYLQHIRES
ACVKWAAVSMWDVCMERGHRDLEQFERVHRYISDQLFTNFKER
TERLVISPLNQLLSMFTGPHKLVQKRFDKLLDFYNCTERAEKLK
DKKTLEELQSARNNYEALNAQIILDELPKFHQYAQGLFTNCVHGY
ABAHCDFVHQALEQLKPLIISLLKVAGREGNLIAIFHEEHSRVLQ
QLJQVFTFFPESLPATKKPFERKTIDRQSARKPLLGLPSYMLQSB
ELRASLLARYPPEKLFQAERNFNAAQDLDVSLLEGDLVGVIKKK
DPMGSQNRWLIDNGVTKGFVYSSFLKPYNPRRSHSDASVGSHSS
TBSEHGSSSPRFPRQNSGSTL^FNPNXSNMAVSFTSGSCQKQPQ
DASPPPKEWDQGTLSASLNPSNSESSPSRCPSDPDSTSQPRSGD
SADVARDVKQPTATPRSYRNFRHPEIVGYSVPGRNGQSQDLVKG
CARTAQAPEDRSTEPDGSEAEGNQVYFAVYTFKARNPNELSVSA
NQKLKILEFKDVTGNTEWWLAEVNGKKGYVPSNYIRKTEYT


7015 


1842 


513 


RQAWHE \ VAAPS WRGARLVQS VLRVWQVGPHVARERV1 P FSSLL 
GFQRRCVSCVAGSAFSGPRLASASRSNGQGSALDHFLGFSQPDS 
SVTPCVPAVSMNRDEQDVLLVHHPDMPENSRVLRWLLGAPNAG 
KSTLSNQLLGRKVFPVSRKVHTTRCQALGVITEKETQVILLDTP 
G 1 1 S PGKQKRHHLELSLLEDPWKSMES ADL VWLVDVS DKWTRN 
QLS PQLLRCLTK YSQIPS VLVMNKVDCLKQKSVLLEIiTAALTEG 
WNGKKLKMRQAFHSHPGTHCPSPAVKDPNTQSVGNPQRIGWPH 
FKEI FMLSALSQBDVKTLKQYLLTQAQPGPWE YHSAVLTSQTPE 
EI CAN I IREKLLEHLPOE VPYNVQQKTAVWEEG PGGELVIQQKL 
LVPKESYVKLLIGPKGHVISQIAQEAGHDLMDIFLCDVDIRLSV 
KLLK 


7016


167 


2513 


I LNAPKPPP PRDS VEAVAAKRDTGGGS WGTGMDVSGQETDWRST 
AFRQ KLVSQ I EDAMRKAGVAHSKS SKDMESHVFLKAKTRDE YLS 
LVARLI IHFRDIHNKKSQASVSDPMNALQSLTGGPAAGAAGIGM 
PPRGPGQSLGGMGSLGAMGQPMSLSGQPPPGTSGMAPHSMAWS 
TATPQTQLQLQQVAAAAAAATARSSSSSSRRRYSSSSSSSNSKQ 
FQAGX3SAMOX3\QFO^\ WOX2QO^L\OXX)OX2QQQHIj I KLHHQNQQ 
QIQCXX20X2U2RIAQLQI^0^2QQQOX2QQQQQQQQALQAQPPIQQP 
PMO^PQPPPSO^PQQLQQMHHTQHHQPPPQPQQPPVAQNQPSQ 
IiPPQSQTQPLVSO^QALPGQMLYTQPPLKFVRAPMVVQQPPVQP 
QVQQQQTAVQTAQAAQMVAPGVQVSQSSLPMLSSPSPGQQVQTP 
QSMPPPPQPSPQPGQPSSQPNSNVSSGPAPSPSSFLPSPSPQPF 
\QS PVTARTPQNFS VPS PGPLNTP VNPSS VMS PAGSSQAEEQQ Y 
LDKliXQLSKYIEPLRRMINKIDKNEDRKKDLSKMKSLLDXLTDP 
SKROTLKTLQKCEIALEKLKNDMAVPTPPPPPVPPTKQQYLCQP 
LLDAVLANIRSPVFNHSI/YRTFVPAMTAIHGPPITAPVVCTRKR 
RLEDDERQS X PS VLQ GE VARLDPKFLVNLDPSHCSNNGTVHL I C 
KLDDKDLPSVPPLELSVPADYPAQSPLVflDRQWQYDANPFLQSV 
HRCMTSRLLQLPDKHSVTALLNTWAQSVHQACLSAA 


7017 


1 


1785 


INLGNTCYMNSVI *ALFMATDFRRQVLSIiNLNGCNSI*MKKLQHL 
FAFLAHTQREAYAPRIFFEASRPPWFTPRSQQDCSEYLRFLLDR 
LHE EE KI LKVQAS HKPS E ILE CS ETSLQE VAS KAAVLTE TPRTS 
DGEKTLIEKMFGGKLRTH1RCLNCRSTSQKAEAFTDLSLAFWPS 
YS LEYMS CPDCS QS PS X QIXSGIiMQASVPG PSEE PVVYNPTTAAF 
I CDS LVNEKTI GS PPNEF YCSENTS VPNBSNKILVNXDVPQKPG 
GETTPSVTDLLNYFLAPEILTGDNQYYCENCASIiQNAEKTMQIT 
EEPEYLILTLLRFSYDQKYHVRRKILDNVSLPLVIiELPVKRlTS
FS S L S E S WS VDVDFTDLS ENLAKKLKPSGTDE AS CTKLVP YLLS 
SVVVHSGISSESGHYYSYARNITSTDSSYQMYHQSEAIALASSQ 
SHLLGRDS PSAVFEQDLENKBMS KEWPLFNDSRVTFTS FQSVQK 
I TSRFP KDTAYVLLYKKQHSTNGLSGNNPTSG LW1NGDPPLQKE 
IiMDAITKDNKLYLQEQELNARARALQAASASCSFRPNGFDDNDP 
PQSCGPTGGGGGGGFNTVGRLVF 


7018 


484 


1066 


SLVFRGNTWSGEAGHHCSALFNLAAYHQLFVGTERIRAPEIIFQ 
P SLI GEEQAG I AETLQ Y ILDR Y PKDVQEMLVQNVFLTGGNTM YP 
GMKARMEKELLEMRPFRSSFQVQLASNPVLDAWYGARDWALNHL 
DDNEVW I TRKE YEEKGGE YJjKEHCASN I YVP I RLPKQASRS SDA 
QASSKGS AAGGGGAGSQA 


7019 


1048 


335 


APGGFLVTMVFPAPSPPWMLGCCSHEVTAGPPTLCK0MSALVAA 
RMRHIPLAPGSDWRDLPNIEVRLSDGTMARKLRYTHHDRKNGRS 
SSGALRGVCSCVEAGKACDPAARQFNTL I PWCLPHTGNRHNHWA 

SQGFPDTYRLFGNILDKHRQVGNAVPPPLAKAIGLEIKLCMLAK 
ARESASAK I KKEEAAKD 


7020 


1 


2154 


FADS KRKSVLLDKI KNLQVALTS KQQS LETAMSFVARNT FKRVR 
NGFLMRKVAVFFSNTPTRAS PQLRE AVLKLS DAG ITPLFLTRQE 
D RQL I NALQ INNTAVGHALVLPAGRDLTDFLENVLTCHVCLDI C 
NIDPSraFGSWRPSFRDRRAAGSDVDIDMAFILDSAETTTLFQF 
NBMKKYIAYLVRQLDMSPDPKASQHFARVAWQHAPSESVDNAS 
MPPVKVEFSLTDYGSKEKLVDFLSRGMTQLQGTRALGSAIEYTI 
ENVFESAPNPRDLKI WLMLTGEVPEQQLEEAQRVI LQAKCKGY 
FFVVLGIGRKVNIKEVYTFASEPNDVFFKLVDKSTELNEEPLMR 
FGRIiLPSFVSSENAFYLSPDIRKQCDWFQGDQPTKNLVKFGHKQ 
VNVPNNVTS S PTSNP VTTTKPVTTTKPVTTTTKP VTTTTKPVT I 
IN Q PSVKPAAAKPAPAKP VAAKP VATKTATVRP P VA VKP ATAAK 
P VAAKPAAVR P PAAAAAKP VATKP EVPR POAAKPAATKPATTKP 
MVKMSREVQVFEITENSAKLHWERPEPPGPYFYDLTVTSAHDQS 
LVLKQNLTVTDRVIGGLLAGQTYHVAVVCYLRSQVRATYHGSFS 
TKKSQP PP PQ PARS AS S S TINLMVSTEPLALTETDICKLPKDEG 

TCRDFILKWYYDPNTKSCARFWYGGCGGNENKFGSQKECEICVCA 
PVLAKPGVISVMGT 


7021 


2 


338 


VNAVSFFPNQYAFATGSDDATCRIiFDLRADQELLLVSHDNIICG 
ITSVAFSKSGRLLLAGYDDFNCNVWDTLKGDRAGVLAGHDNRVS 
CLGVTDDGMAVATGSWDS FbRIWN 


7022 


2 


856 


VYIGSFWSHPLLiPDNRKLFEtAEEQDLFRDiQSLPRNAALRKUi 
DLI KRARLAKVHAY1 ISSLKKEMPSVFGKDNKKKELVNNLAEI Y 
GRIEREHQISPGDFPNLKRMQDQLQAQDFSKFQPLKSKLLEWD 
DMLAHD IAQLMVLVRQEE SQRPI QMVKGGAFEGTLHG PFGHGYG 
EGAGEGIDDAEWWARDKPMYDEI FYTLSPVDGKITOANAKKEM 
VRSKLPNSVI/jKIWKLADIDKDGMLDDDEFALANHLIKVKLEGH 
ELPNELPAHLLP PSKRKVAE 


7023 


2 


748 


AMVFGGWPYVPQYRDIRRTQ^ADGFSTWCLVLLVANILRTI^
WFGRRFESPLLWQSAIMILTMIXMLKLCTEVRVANELNARRRSF
TAADSKDEEVKVAPRRSFLDFDPHHFWQWSSFSDYVQCVLAFTG
VAG YI TYIjS I DS ALFVETLGFLAVLTEAMLGVPQL YRNHRHQS T 
EGMS I KMVLMWTSGDAFKTAYFLLKGAPLQFSVCGLLQVLVDIA 
ILGQAYAFARHPQKPAPHAVHPTGTKAL 


7024 


1207 


190 


RTGVTG WAQVWMFGGGGVtSSGEQLQMPVKPERGLGPSDGWLV
SSRRGSPGTVI/5LPFWIiTP\^VSRSIRSMLLLTRSPTAWHRLS
QLKPPVLPGTLGGQALHLRSWLLSRQGPAETGGQGQPQGPGLRT
RLLITGLFGAGLGGAWIiALRAEKERLQQQKRTEALRQAAVGQGD
FHLLDHRGRARCKADFRGQWVLMYFGFTHCPDI CPDELBKLVQV
VRQLEAEPGLPPVQPVFITVDPERDDVEAMARYVQDFHPRLLGL
TGSTKQVAQAfiHSYRVYYNAGPKDEDQDYIVDHSIAIYLLNpDG 
LFTDYYGRSRS AEQI SDSVRRHMAAFRS VIiS 


7025 


IB J« 


832


ERNSPIGNNENIj*K\HSLDCLCFRGDWEGNTQPQTLQDNQBECF 
KQVIRTCE KRPTPNQHTVFOTiHQRLbrrGDKLNEFKELGKAPlSG 
SDin\5HQLIHTSEKFCGDKECGNTPLPDSBVIQYQTVHTVKKTY 
E CKE CG KS FSLRSS LTGHKRI HTGEKPPKCKDCGKAFRFHSQLS 
VHKRIHTGEKSYECKBCGKAFSGG


7026


328 


1146 


NPNP S IGDI KDIKKAAKSMLDPAHKSHFHPVTPSLVFLCFI FDG 
LHQ ALLS VG VS KRSNTWGNENEERGTP YASRFKDM PN F I ALE K 
S S VLRHCCDLLIG VAAGS SDKI CTSSLQVQRRF KAMMAS IGRLS 
HGE S ADLL I S CNAE S AI G W I SSRP WVGBLM FTFLFGD FES PLHK 
LRKSS*LPRKHR*QPINAVRMFI*DQCMDGSlAIjRAIVSEIPVFE 
EKKNNG*KGIGEIF*VWGCTLPPHYWGAVTTNVPKLSNSGKLLG 
QDEQPHIFG 


7027 


43 


954 


GRRI^QQQRPEDAEDGAEGGGKRGEAGWEGGYPEIVKENKLFEH 
YYQELKI VPEGEWGQFMDALREPLPATLR 1 TGYKSHAKEILHCL 
IOTKYFKELBDLEMDGQKVEVPQPLSWYPEBLAWHTNLSRX1LRK 
S PHLEKFHQFLVSETESGNISRQEAVSMI PPLLLNVRPHHKILD 
MCAAPGSKTTQLIEMLHADMNVPFPEGFVIANDVDNKRCYLLVH 
QAKRLS S PCI M WNHDAS S I PRLQ I DVDGRKE ILFYDR 1 LCDVP 
CSGDGTMRKN I DVWKKWTTLNS LQLHGLQLRIATRGAEQL 


7028


189 


608 


SRPPPEPEPGTMVEKGSDSSSEKGGVPGTPSTQSLGSRNKIRNS 
KKMQSWYSMLSPTYKQRNEDFRKLFSKLPEAERLIVDYSCALQR 
EILLQGRLYLSENWICFYSNIFRWETTISIQLKEVTCLKKEKTA 
KLIPNAIQ 


7029 


1343 




VLESNTEAKQATGTSSKLRHGTGQEKGRBGPRCPSGLAQLRLWG 
/ PCPHAGRETG PRAS AP I PGS *GHGWHW*RXDGRGERS EGPSAL 
SPHSPSLLNMQQAPTHVGPGMGSQRPRSSWPEQVGVGSQLSRE 
RWRA* RS LPGAAAS ERTEMTKERS P /R PCX2G YDSSNWFTQPGKK 
TRKRNSRRNTMVSRGGGCLLYPLQS IMPE*QLR * GAHASPPTQG 
R*GKGGPRSPLTKASGTTHI PTPFFGS I P/RPTRDSGPGTDNS \ 
AAPGQKRGHREA * QGPEPV/ WGRVTTHLQGPAG * TKPLGS \ RNW 
VPGPAEGEQGEGAGLEGRP * PLKGCRSTLTFSPQLS IPMVGKKP 
PBGTTAS FFP\RSCH3B * RKPPPSCPHAPALSLPHPLPLPLPPL 
PLPLPGAGT* HSARSGRPGQ SETGS LCHNCHHCP PHCPKCS PGG 
T 


7030 




521 


FVCFSAPGSGQGGKRRVKMELSAVGERVFAAEALLKRRIRKGRM 
E YLVKWKGWSQKYS TWE PE ENILDARLLAAFEERERE MEL YGP K 
KRGPKPKTFLLKAQAKAKAKTYEFRSDSARGI R I P YPGRfiPQDL 
ASTSRAREGLRN \RVCPRQRAAPAPAAP \PRRGPSGPGPRPG* G 
PGLHFPGPGGPSKHGFVPASEQHQHQQHLPRRGPSGPGPRPG 


7031 


960 


59 


HCSVPGAEWPRKPPAQICPQLTSRPHLSSPRSLSPGCGHSPGPG 
/CKPS/RHCDELHEGPSRTAALPCGKPQPKHGVEECG/PCPCIiA 
PRRLTEPPALTVSPVGRAAPSGAL*PSGRACSACSHRLAPEAAL 
S AAAPR PSLGSGQNASGLPAAS LPPQDS SQPHKTVPS PARS VP P 
LGAQARAAPPRLWCPRALVSG* EAS PEAVS VAAGP PVPGPT PST 
S G STASH S RRGC * S PR* TPAP PRRDHGRS AAFE VLTAAASAQPC 
ASQGGPRPTGAGRTPSPLGLPFSRGPPAASARPFCRHPSL 


7032 


1393 


2104 


RRPGRTEPVEPPPVPPPPRASNSKSRCR*RNLHLAPL*QSPLRK 
SRQIGTSSLPFGRSAGERPRPAATFCLSRGGSSPVFL*PSSSSL 
BPWMKRQFGRLHSLFWKSWQKMNSFLLTPKLDTSLMSGMRYRQR 
LPRLHTFLKKSLQMASELAPPLPTPAPLASSLPPPPGPPPLLPV 
PLA*LSRSGILVPPNSGFSLSC\PLGDH*GSSGEVRGSCGSPPP 
HHCWVLPPPP*LLLPPR 


7033 


689 


815 


RSRDCLSS SATSNRARRS KCSGPKRATPLDSGPGP *APPGPSSA
LMMPSSCPWRTGALGPSPAGSRALGRCtf^SVGPGSRWLTRTSSP 
GCATRTWRTMRMEPRPLRSRMGESAPGIPAELPSAAPSGPSAPS 
AAAPSAPTTPAAAGPNTL*SRRTAEWCWPPSCSCCWGWC*SWSA 
WDWRRPPLQVSPAPSSSCRASCCWCLESIT+SSSTARSRATGAS 
SSSTCPTSRSDRGAAWTP\SPMGAPLLPCSVPLISREEALQDPR 
NPSP*GVCSGSSGHAGLALGKPPVACSVP 


7034 


92 


1942 


EDTSSMPFRLLI PLGLLCALL PQHKGA PG P DG S APDPAHY R ER V 
KAM F YHAYDS Y LENAFPFDELRPLTCDGHDTWGS FSLTLI DALD 
TLL\TLFYFQI LGNVSE FQRWEVLQDSVDFDIDVNASVFETNI 
RWGGLLSAHLLSKKAGVETVEAGWPCSGPLLRMAEEAARKLLPA 
FQTPTGMP YGTVNLLHGVNPGETP VTCTAG IGT FI VEFATLSS L 
TGDP VFEDVARVALMRLW ES RS D I G LVGNH I DVLTGKWVAQDAG 
IGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTRFDDW 
YLWVQM YKGTVS MP VFQS LEAYWPGLQS LIGD I DNAMRTFLNY Y 
TVWKQFGGLPEFYNI PQGYTVE KREG Y P LRPEL I ESAMYL YRAT 
G D PTLL ELGRD AVES I E KX S KVECG FAT I KDLRDHKLDNRMES F 
FEAETVKYLYLLFDPTNFIHNNGSTFDAVITPYGECILGAGGYI 
FNTEAHPIDPAALHCCQRLKEEQWEVEDLMREFYSLKRSRSKFQ 
KNTVSSGPWBPPARPGTLPSPENHDQARERKPAKQKVPLLS CPS 
QPFTSKLALLGQVFLDSS * PLDNFFIFIFLRLNYNKXtLLAI IKK 
K 


7035 



92 


1942 


EDTSSMPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAkVRERV 
KAMFYHAYDS Y LENAF P FD ELRPLTCDGHDTWG S F S LTL I DALD 
TLL \ TLFYFQ I LGNVSE FQRVVEVLQDS VDFDI DVNAS VFE TNI 
RWGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMAEEAARKLLPA 
FQT PTGMP YGTVNLLHGVNPGETPVTCTAGIGTF I VEFATLS SL 
TGDPVFEDVARVALMRLWES RSD IGLVGNHIDVLTGKWVAQDAG 
I GAG VDS YFE YLVKGAI LLQDKKLMAM FLEYNKAIRNYTRFDDW- 
YLWVQMYKGTVSMPVFQSLEAYWPGLQSLIGDIDNAMRTFLNYY 
TVWKQFGGLPEFYNI PQGYTVE KREGYPLRPELIES AMYLYRAT 
GDP TLLELGRDAVES I E KX S KVECG FAT I KDLRDHKLDNRMES F 
FLAETVKYLYLLFDPTNFIHNNGS TFDAVITP YGECILGAGG YI 
FNTEAHPIDPAALHCCQRLKEEQWEVEDLMRBPySLKRSRSKFQ 
KNTVSSGPWEPPARPGTIiFSPENHDQARERKPAKQKVPLLS CPS 
QPFTSKLALLGQVFLDSS*PLDNFFIFIFLRLNYNKLLLAIIKK 
K 


7036 


442 


761 


CLAPLFSCFQIINIiHLAPSGRLRWAWLRGPGRN*LPGEGPSIPT 
RNW* ERKAGCSQPC/ PAQQHHGRPPGVS PLPRD PHPTTLRPLPP 
PPPPPPPPPRRPPRNRRPG 


7037 


442 


761 


CLAPLFSCFQI INLHLAPSGRLRWAWLRGPGRN* LPGEGPS I PT 
RNW* ERKAGCSQPC/ PAQQHHGRPPGVSPLPRDPHPTTLRPLP P 
PPPPPPPPPRRPPRNRRPG 


7038 


155


891 


GAGAASDMSSGLRAADFPRWKRHISEQLRRRDRLQRQAFEEIIL 
Q YNKLLE KS D LH S VLAQ KLQAE KHD VPNRHE I S PGHDGTWNDNQ 
LQEMAQLRIKHQEELTELHKKRGELAQ\RVIDLNNQMQRKDREN 
QMNEAKIAECLQTISDLETECLDLRTKLCDLERANQTLKDEYDA 
LQI TFTALEGKXJ?XTTEENQELVTRWMAEKAQEANRLNARE*KR 
LQEAAS PAAERACRS S KGTS TSRTG 


7039 


155 


891 


GAGAASDMSSGLRAADFPRWKRH I S EQLR RRDR LQRQAFEE I I L 
QYNKLLEKS DLHSVLAQKLQAEKHDVPNRHE I S PGHDGTWNDNQ 
LQEMAQLRI KHQE ELTELHKKRGELAQ \RV I DLNNQMQRKDREM 
QMNE AKI AE CLQT I S DLETE CLDLRTKLCDLERANQTLKDE YDA 
LQ I TFTALE GKLRKTTEENQELVTRWMAE KAQEANRLNARE * KR 
LQEAAS PAAERACRS SKGTSTSRTG 


7040 


34 


789 


KX TP PRR PHRCSSGHGS DNS5 VLS GELP PAMG KTAL F YHSGGS S 
GYESVMRDSEATGSASSAQDSTSENSSSVGGRCRSLKTPKKRSN
f otjyKrtKjjx f t\u&uu i £>£> PvRKPPNiTGvRWVtJy PIiRSSPRGliG 
E P FE I KVYE I DDVERLQRRRGGAS KEAMCFNAKLKI LEHRQQRI 
AEVRAKYEWLMKELEATKQYLMLDPNKWLSEPDLEQVWELDSIiE 
YLEALECVTERLESR VNFCKAHLMMITCFD IT 


7041 


1


567 


SGRVAMGRRRAPAGGSLGRALMRHQTQRSRSHRHTDSWLHTSEL 
NDGYDWGRliNLQSVTEQSSLDDPLATAEIiAGTEPVAEKUJ I KFV 
PAEARTGLLS FEESQRI KKLHEENKQ PLC I PRRPNWNQNTTP E E 
LKQAEKDNFLEWRRQL\VRI»EEEQKLILTPFERNLDFWRQLWRV 
IERSDIWQIVDA 


7042 


7 


345 


PIHMAAAALRAbl\l^P^PHIQGYLLLSASHG\ATSLHTKQAL 
PLETVTMYTV1PKSKWLVKPDTQYPYSENLDEFKRLAENSASN 
DDLLMAEVAISDYGDKLTLELREKY 


7043 


2 


2170 


ARGMAARDSDSEEDLVSYGTGLEPLEEGERPKKP I PLQDQTVRD 
E KGRY KR FHGAFSGG FS AGYFNTVGS KEG WTPSTFVS S RQNRAD 
K5 VLG PEDFMDEEDLS EFG I AP KAI VTTDD FAS KTKDR I REKAR 
QLAAATAP I PGATLLDDLITPAKLS VGFELLRKMGWKEGQGVGP 
RVKRRPRRQKPD PGVKI YGCALP PG SSEGS EGEDDD YLPDNVT F 
APKDVTPVDFTPKDNVHGIAYKGLDPHOALFGTSGEHFNIiFSGG 
S ERAGDLGE IG LNKGRKLGISGQAFG VGALEEEDDDI YATETLS 
KYDTVLKDEEPGDGLYGWTAPRQYXNQKESEKDLRYVGKILDGF 
SLASKPLSSKKIYPPPELPRDYRPVHYPRPMVAATSENSHLLQV 
LS ESAG KATPDPGTHS KHQLNAS KRAELLGETP I QGS ATS VLEF 
LS QKDKER I KEMKQATDLKAAQLKAR SLAQNAQS SRAQPS PAAA 
AG HCS WNMALGGGTATLKASNFKP FAKDPE KQKRYDE FLVHMKQ 
GQKDALERCLDPSMTEWERGRERDEFARAALLYASSHSTLSSRF 
THAXEEDDS DQVEVPRDQENDVGDKQSAVKMKM FGKLTRDTFE W 
HPDKLLFQ / RLVGLPRVKRDKYS VFNFLTL PBTASLPTTQASSE 
KVSQHRGPDKS RKPSR WDTSKHEKKEDS ISE FI>RIARSKAEPPK 
QQSS P L VNKEE EHAP ELS AN 


7044 


276 




cj v ucrAA^KIV.VAJLJijXJjiijVgyAGNIIPRLYIiLITVGWYVKS 
FPQSRKDILKDLVEMCRGVQHPLRGLFLRNYLLQCTRNILPDEG 
EPTDEETTGDiSDSMDFVIjLNFAEMNKLWVRMQHQGHSRDREKR 


7045 


3 


513 


LGFKMEALSRAGQEMSLAALKQHDPYITSIADLTGQVALYTFCP 
KANQ WEKTD I EGTLFVYRRSAS P YHG FTI VNRLNMHNLVE PVNK 
DLEFQLHEPFLLYRNASLS I YS IWFYDKNDCHRI AKLMADWEE 
ETRRSQQA/RSGQTESQPGQWLQRPQAHRHPGDAEQSQG


7046


3 




KAMQWEKTD I EGTLFVYRRSAS PYHGFTIVNRLNMHNLVE PVNK 
OLE FQLHE P FLL YRNAS LS I YS IWFYDKNDCHRI AKLMADWEE 
ETRRSQQA/RSGQTESQPGQWLQRPQAHRHPGDAEQSQG 


7047 


103 


486 


QMKI EKCGWSEGLTS I KGNCHNFYTAIS KDVTYKELKNLLNSKN 
X I'UJ ±uv Kn AHfll JJB lyM fob XNVPIjDEVGEALQMMPRPF KE KY 
NEVKPSKSDS/IVFSYLAGVRSKKALDTAISLGFHSYYER 


7048 


92 


627 


FFCLTLLS S WD YRHHATRRVI SS P VFTMEDSGKTFS SEEEEANY 
WKDIJ^TYKQRAENTQEELREFQEGSREYEAELETQLQQIETRN 
RDLLS ENNRLRME LET IKEKF EVQHS EGYRQI S AL EDDLAQTKA 

I KDQIiQKYIRELEQANDDIiERAKRATDHGLSKTFE \QRLN\ QAI 
EKKW 


7049 


393 


938 


KRTGSAS YOGP P PGLCGPATXASVAQRCSS VGKI PARRCYEDEL 
VPVFEAVGRIYELRLMMDFDGKNRGYAFVMYCHKHEAKRAVREL 
NNYEIRPGRLIX3VCCSVDNCRLFIGGIPKMKKREEILEEIAKVT 
EGVLDVTVYASAADKMKNRGLRLRGVREPPRGCHWLGRKLIAWX 
ASSLWG 


7050


393 


938


KRTGSASYGGPPPGLGGPATXASVAGRCSSVGKIPARRCVEDEL
VPVFEAVGRI YBLRLMMDFDRKTJPnvti pvmvpuvup *. vu mrp-cft — 

NNYEIRPGRLLGVCCSVDNCRLFIGGIPKMKKREEILESIAKVT 
EGVLDVIVYASAADKMKNRGLRLRGVREPPRGCmiT^PlfT.Tawif 
ASSLWG 


7051 


119 


81* 


KKMNLAE I CDNAKKGREYALLGN YDSSMVY YQGVMQQ I QRHCQ S 
VRDPAIKGKWQQVRQELLEEYEQVKSIVGTLESFKIDKPPDFPV 
S CQDEP PRDPAVWP PP VPAEHRAPPQ 1 BR/ RQS RSKTS EERNGR 
SRS PGTCRPST\ PISKSEKPSTSRDKD YRARGRDDKGRKNMQDG 
ASDGEM PKFDGAG YDKDLVEALBRDI VSRN PS IHWDDIADLEEA 
KKLLREAGVLPMWM 


7052


467


715


fa i-j^KVajwfcKbbNfEkM 1 X Y FDS YAHFG 1HEEMLKDE VRTL
TYRNSMYHNKHVFKDKWLDVGSGTGILSMJPAAROGPRR


7053


467


715


S C PGRGKMS KUbNPEEMTSRD YY FDS YAHFG X HEEMLKDBVRTL
TYRNSMYHNKHVFKDKVVU)VGSGTG1LSMPAARCX3PRR


7054


1 


1036 


GTSQRSRETDARRJ^AGAEPTARtPWPAALEEWPSCPCEPLGPG 
KKLKWUArffiYDEKLARPRQAHLNPFNKQSGPRQHEQGPGEEVPD 
VTPBEALPELPPGEPEFRCPERVMDLGIjSEDHFSRPVGLFIiASD 
VQQIJIQAIEECKQVILELPEQSEKQKDAVVRLIHLRLKLQELKD 
PNEDEPNIRVLLEHRFYKEKSK5VKQTCDKCNTIIWGLIQTWYT
CTGCYYRCHSKCLNLISKPCVSSKVSHQAEYELNICPETGLDSQ
LdKUiautAi!'!/ C5/DGVVP5EARQCDYTGQYYCSHCHWNDLAV 
IPARVVHNWDFEPRKVSRCSMRYIJUjMVSRPVLRLREIN 


7055 


2 


527 


DSRRVS WRS WLANE / WGK& L<iXF I WLSMNVLL F WKTFLL YNQGP 
EYHYLHQMLG/ALCLSRASASVLNLNCSLILLPMCRTLLAYLRG 
SQKVPSRRTRRLLDKSRTFH ITCGATICI FSGVHVAAHLVNALN 

FSVNYSEDFVELNAARYRDEDPRKLLFTTVPGLTGVCMEVVLFL 
M 


7056


2


527


DSRRVS WRSWLANE/WGKHIjCLFIWL&MNVLL^
fiXHx JjnQMLC/ AI^LSRASASVLNLNCSLILLPMCRTLLAYIjRG
SQKVPSRRTRRLLDKSRTFHITCGATI CI FSGVHVAAHLVNALN
FSVNYSEDFVBD«AARYRDEDPRKLLPrrVPGLTGVCMBVVLFL
M


7057


13*6 


431 



SGEVPSQASLRGFFXEDEPGCFGEGENLPEALQNIQDEGTGEQL 
SPQERISEKQLGQHLPNPHSGEMSTMWLEEKRETSQKGQPRAPM 
AQKLPTCREQGKTFYRNSQLI FHQRTHTGETYFQCTICKKAFLR 
SSDFVKHQRTHTOEKPCKCDYCGKGFSDFSGLRHHEKIHTGEKP 
YKCPICEKSFIQRSNFNRHQRVHTGEKPYKCSHCGKSFSWSSSL 
DKHQRSHLGKKPFQ * PVTKLS FP I SI S QPSHKNTQLHQEELCLR 
GYPC 


7058 


1 


4*9 


FSG FGAVP DALGCRMSDLR I TEAFLYMD YLC FRALCC KGPPPAR 
PE YDLVCIGLTGSGKT<!TiT.QK T .nc wc DnKnnrcpwuo t v» ,mn« 
NAILNVKE LGGADN I RKYWSRYYQGSQGVI FVLDS AS SEDDLEA 
ARN*SCTQLLQHPQLCTLPFLILA 


7059


1


1178


WPAFPRQ PAAAAMDALLG TG PRRARGCLGAAG PTS SGRAARTPA
AP WARFSAW LECVCWTFDLELGQALE LVY PNDFRLTDKEKS S I
CY13FPDSHSGCLGDT0FSFRMR0CGGQRS P WHADDR HYNSRAP
VALQREPAHYFGYVYFRQVKDSSVKRGYFQKSLVLVSRLPFVRL
FQALLS LIAPE Y FDKLAPCLBAVCSE I DQW PAPAPGQTLNLP VM
GVWQVRI PSRVDKSESS PPKQFDQENLLPAPWLAS VHELDLF
RCFRPVLTHMQTLWELMLl^EPLLVLAPSPDVBSEMVLALTSCL
QPLRFCCDFRPYFTIHDSEFKBFTTRTQAPPNWLGVTNPFFIK
TLQHWPHI LRVG BP KM3GDLPKQVKLKKP FKV* RPWDTKP


7060


90 


1670 


SVNLPPSLWPWEEAMDSTKSEPLKGSPEAEDGNIEYKKLV^PSO'" 
YRFEHLVTQMKWRLQEGRGEAVYQ IGVEDNGLLVGLAEEEMRAS
LKTLHRMABKVGiu^ITVljkEREVDYDSDMPR^TBVLVRKVPD^ 
QQFLDLRVAVLGNVDSGKSTLLGVLTQGELDNGRGRARLNLFRH 
LHE I QSGRTS SISFE ILGFNS KGEVHG INGTQWGQTLRMGW * * * 
RT * DGGRVWRLF E I V * MNALRGL * TS SAP LRKS MGNQLN* I KNG 
VKI KRQGHPGNGLGPGNS EG VGRAGRRH * G P WAI/3Q WNYS DSR 
TAEE ICE S S S KM I TF IDLAGHHKYLHTT I FGLTS Y C PD CALLL V 
SANTG I AGTTREHLGLALALKVPFFIWS KI DLCAKTTVBRTVR 
QLER VLKQPGCHKVPMLVTSBDDAVTAAQQ FAQS PNVTP I FTLS 
SVSGESLDLLKVFLNILPPLTNSKEQEELMQQLTEFQVDEIYTV 
PEVGTWGGTLSR*IDUATLPTQPSPIYSKTSWPKGGDPGI 


7061 


364 


710 


ARMPSPLGPPCLPVMDPETTLEEPETARIjRFRGFCYQEVAGPRE 
ALARLRELCCQWLQPEAHSKEQMLEMLVLEQFLGTLPPEIQAWV 
RGQRPGSPEEAAALVEGLQHDP*ARMPSPLGPPCLPVMDPETTL 
EEPETARLRFRGFCYQEVAGPREALARIiRELCCQWLQPEAHSKE 
QMLEMLVLEOFLGTIi P PEI OAWVRGQRPGS PEEAAAL VEGLQHD 
PGQI,LG 


7062 


71 


744 


AKAGTbiLHRLrtWLSYFFClPKHKLKSSQKDKVRQFMACTQAGER 
TAIYCLTQNEWRLDEATDSFFQNPDSLHRESMRNAVDKKKLERL 
YGR YKDPQDENKIG VDG IQQFCDD LS LDPAS I S VLVIAWKFRAA 
TQCEPSRKEFIiDGMTELGCDSMBKLKALLPRLBQELKDTAKFKD 
FYQFTFTFAXNPGQKGLDL *MAGAYWJCLVLSGRFKFL YLWNTFL 
MEHH 


7063 


2 


562 


LRTVPDLPGRRFRAMRTGQRR* PELPPDMNSLEOAfifcLKAFERR 
LTEYIHCLQPATGRWRMLLIWSVCTATGAWNWLIDPETQKVSF 
FTSLWNHPFFTISCITLIGLFFAGIHKRWAPSIIAARCRTVLA 
E YNMS CDDTGKLI LKPRPHVQ* QSSLIVMGLKIAP1>RI SDTAKS 
HKGFLLRLDM 


7064 


300 


684 


RDTGSDPSSTRRLCSTCCTGH*PAEPIASPHPSRGTCPPASSAS
SRRTGCWTCPPESGHAQARRSRRASASRWGARGAVRSAVAARGC
SSRAGRWLETPGRRRGP PACAAAAGRLRGPAP * AAPPTASVPAR 
CRCPAARTGAPAAATWLRRRLSGLRAPALGRRRSPGPSPKSAAP 
PLLTPLGAGRAGGSRANS 


7065


1 


555 


ATTTHSARRSGRGAAAEAAASAAGGRQKGPDRKAWEGRRTTPGG 
RSQS EPKAPP PQKRSEAAFAS MAHS PVAVQ VPGMQNNIADPEBL 
FTKLERIGKGSFGEVFKG IDNRTQQVVAI KI I DLEBAGDEIEDI 
QQE I TVLSQCDSS YVTKYYGS YLKGSKLWIIME YLGGGSALDLL 
RAGPFDEFQ 


7066 


356 


676 


PGPQRGPWRAREGGHPLDPADHPRAPASIiRSNVRAATMMQICDT 
YNQKHSLFNAMNRFIGAVNNMDO/TVMVPSLLRDVPLADPGLDND 
VGVEVGGSGGCLEERTPP 


7067 


152 


973 


KENITMATE IGS P PRFFHMPRFQHQAPRQLF YKRPDFAQQQAMQ 
QIiTFDGKRMRKAVNRKTIDYWPSVIKYLENRIWQRDQRDMRAIQ 
PDAGY YNDLVPP IGMLNNPMNAVTTKFVRTSTNKVKCPVFVVRW 
TPEGRRLVTGAS S GE FTLWNGLTFNFETILQAHDS P VRAMTWSH 
NDMWMLTADHGG YVK Y WQ S NMNNVKM FQAHKEAI REAR F IHN I P 
FS WPI VMVKLFS KCXLGAEMHGIiCQFLGNFLHPl NTI FFFVFT 
HSPFCWAPF 


7068 


222 


816 


OTMKEYVLLLFIJujCSAKPFFSPSHIALKNMMLK1)M 
DDDDDDDDDDBDNSLFPTREPRSHFFPFDLFPMCPFGCQCYSRV 
VHCS DLG LTSVP TNI P FDTRM LDLQNN KI KB I KEND FKGLTS LY 
GLILNNN KLTKIHPKAFLTTKKLRRLYLSHNQLS E I PLNLPKSL 
AELRIHENKVKKIQKDTFKKK 


7069 


1147 


1765 


FRDHRRYPYVNEQSGESQWBFPDGEEEEEESQAQENRDETLAKQ 
TLKDKTGTDSNSTES SETSTGSLC KES FSGQVSSSSLMPLT P FW 
TLLQSNVPVLQPPLPLEMPPPPPPPPESPPPPPPPPPAPKMPPP
EKTKKGRKDXAKKSKTKMPSLVKKWQSIQRELDEEDNSSSS^ED 
RVS TAQKR IEEWKQQQLVSGMAERNANFEA 


7070 


1 


547 


IX* l l^HaUbbAVvKA 1 AL i EuRLAUEtiENE KJjKUUARQKLPMDLLV 
LEDH KHHGAQS AALQKVKGQER VRKTSLDLRRE I IDVGG IQNL I 
ELRKXRKQKKRDALAASHE P PP E PEE I TGP VDEETFLKAAVEGK 
MKVI EKFLADGGSADTCDQFRRTAIjHRAS LEGHME ileklldng 
ATVDFQ 


7071 


2 


921 


ARGTLRALETAKKVGKVGANGQKAAGPSADSVTENKIGSPPKTP
VSNVAATSAGPSNVGTELNSVPQKSSPFLTRVPAYPPHSENIQY
FQD PRTQI PFEVPQYPQTG YYPPPPTVPAGVAPCVPRFVRSNNV 
PESSLPPASMPYADHYSTFSPRDRMNSSPYQPPPPQPYGPVPPV 
PSGMYAPVYDSRRIWRPPMYQRDDIIRSNSLPPMDVMHSSVYQT 
SLRERYNSLDGYYSVACQPPSEPRTTVPLPREPCGHLKTSCEEQ 
IRRKPDQWAQYHTQKAPLVSSTLPVATQSPTPPSTLNRGEGS 


7072 


2 


921 


ARGTLRAIxETAKKVGKVGANGQKAAGPSADSVTENKIGSPPKTP ' 
VSNVAATS AGPSNVGTBLN SVPQKSS PFLTR V PAYPPHSEN IQ Y 
FQDPRTQIPFEVPQYPOTGYYPPPPTVPAGVAPCVPRFVRSNNV 
PESSLPPASMPYADHYSTFSPRDRMNSSPYQPPPPQPYGPVPPV 
PSGMYAP VYDSRR I WRPPMYQRDDI I RSNSL PPMDVMHSS VYQT 
SLRERYNSLIX5YYSVACQPPSEPRTTVPLPREPCGHLKTSCEEQ 
IRRKPDQWAQYHTQKAPLVS STLPVATQS PT PPS TLNRGEGS


7073


50 


504 


LAHGSFGVSDFPAPAAAPAHTLTSFSGSLSPQFRKPLGRAPAMP 
LVRYRKWI LGYRCVGKTS LAHQFVEGEFSEGYDPTVENTYSKI 
VTLGKDE FHLRXVDTAGQDE YS ILPYSFI I GVHGYVLVYSVTSL 
HSFQVIESLYQKLHEGHGK 


7074 


263 


1003 


VCPVLCSTRQEPGHSSLVTYFGKPTRRKEFLLGHCXAAGKPOJriS 
VDLETNYAELVIJ)VGRVTI^ENSRKKMKDCKIjRJCKQNERVSRAM 
CALLNSGGGVIKAEIENEDYSYTKDGIGLDLENSFSNILLFVPE 
YLDFWQNGNYFLIFVKS WSLNTSGLR ITTLSSNL YKRDITSAKV 
NWAJ^yUjEcIiKDMKKTRGRLYLRPELI^ 
AGVFFDRTELDRKE KLTFTESTHVEI 


7075


598 


1005 


NYlNFFyRKEYPPHVOKVEjCNPVRLSRLQGVERlMkKTEBSESQ " 
VEPEI KR KVQQKRHCB T YQ PTP PLS PAS KKCLTHLEDLQRNCRQ 
AITLNESTGPLLRTS IHQNSGGQKSQNTGLTTKKFYGNNVEKVP 
IDII 


7076 


279 


1049 


LQSESSNAAEGNEQRHEDEQRSKRGGWS KGRKRKKPLRDSNAPK 
SPLTGYVRFMNERREQLRAKRPEVPFPEITRMLGNEWSKIiPPEB 
KQRYLDEADRDKERYMKELEQYQKTEAYKVFSRKTQDRQKGKSH 
Ryu/uucyA i nunc jus l ts v rvtLKo v c JL/ I Jr I F TEE FLtNHS KAREAEL 
RQLRKSNMEFEERNAALQKHVESMRTAVEKLBVDV1QERSRNTV 
LQQHLETLRQVLTSSFASMPLPBXGETPTVDTIDSYM 


7077 


3 


1119 


ooriuoiNir. x nijIjAIiKK. Jl UK. X v?r WiGSQ i SGS LiKSS IPVDVARQR 
ELKWLDMFSNVJDKWLSRRFQKVKLRCRKGIPSSLRAKAWQYLSN 
Siu^LLEQNPRKFEELBRAPGDPKWLDVIEKDLHRQFPFHEMPAA 
RGGHGQQD LYRI L KAYTI YRPDEG YCQAQAP VAAVLLMHMPAEQ 
AFWCLVQI CDKYLPG YYS AGLEAI QLDGB I F FALLRRAS PLAHR 
HLRRQRI DP VLYMTE WFMCI FARTLPWASVLRVWDMFFCEGVKI 
IFRVALVLLRHTiVSSVEKLRSCQGMYETMEQLRNLPQQCMOBDF 
LVHEVT^PVTEALIEREl^AQLKKWRETRGEIiQYRPSRRLHGS 
RAIHEERRRQQPPLGPSSS 


7078 


483 


767 


FQGQRMAGEQ KPSSNLLEQF 1 L )^KGTSGSAi'lI l Al#llS<l} VLE^JpG - 

VYVFGELLELANVQELAEGANAAYLQLI^^ 

S LP ELY 


7079 


2 


376 


SWEFKRPKEPSGSDGESDGPIDVGQBGQLSQMARPLSTPSSSQ 
MQARKKRRGI IEKRRRDRINSSLSEIiRRLVPTAFEKQGSSKLEK
AEVTjQMTVDHLKMIiHATGGTGTHALLFQAS fiqqi f 


7080 


200 


595 


VQLP LE A PCLS LLS CRDHSGGNRDLS RRHRDCRV YGS PQDG I P Y " 
LTHPLCHQDVVSVGRLQIRALATPGHTQGHLVYLt/DGBPYKGPS 
CLFSGDLLFLSGCGEFPRKRBELGEBGETEVRAATVPWRALKP ' 


7081 


213 


506 


AVTEEEMIIxNSLSLCYHNKLIIiAPMVRVGTLPMRLLALDYGADI *~ 

VYCEEIilDLKMIQCKRWNEVLSTVDFVAPDDRWFRTCEREQN 

RWFQMGTS 


7082 


3 


1137 


APS RNTMLMAW CR GP VLLCLRQ G LGTNS FIiHGLGOE PFEGAk 
CCRSSPRDLRDGEREHEAAQRKAPGAESCPSLPLSISDIGTGCL 
SSLENLRLPTLREESS PRELEDS SGDQGRCGPTHQGSEDPSMLS 
QAQ S AT S VE ERHVS PS CSTSRERPFQAGEL I LAETGEGETKF KK 
LFRLNNFGLliNSNWGAVPFGKIVGKFPGQI LRSSFGKQYMLRRP 
ALEDY WLMKRGTAI TFPKDINM I XiSMMD INPGDTVLEAGSGSG 
GMSLFLSKAVGSQGRVISFEVRKDHHDLAKiCNYKHWRDSWKLSH 
VEEWPDNVDFIHKDISGATEDI KSLTFDAVALDMLNPHVTLPVF 
YPHLKHGGVCPVYWN I TQVI ELLD 


7083 


115 


541 


feSNAVQLTRMEVAMKSLSLLYPK^LSRHVSVRTSVVTQQUiSEP 
S PKAPRARPCRVSTADRSVRKGIMAYSLEDLLLKVRDTLMLADK 
PFFLVLEEDGTTVETEEYFQALAGDTVFMVIiQKGQKWQPPSEQG 
TRHPLSLSHK 


7084 


3 


522 


NS VS VS SQSRFLAS VPGTGVQRSAAADMAASTAAG KQR I PK VAK 
VKNKAPAEVQITAEQLLREAKERBLEIJjPPPPQQKITDEEELND 
YKLRKRKTFEDNI RKNRTVISNWI KYAQWEESLKEIQRARS I YE 
RALDVDYPJJITLWLKYAEMEMKNRQVNHARNIWDRAITTL 


7085


243 


1499 


RQLARLRRRG WRS P FGGAPMAH I T I NQ YLQQVYEAI DSRDGAS C 
AELVS FKHPHVANPRLQMASPEEKCQQVLEPPYDEMFAAHLRCT 
YAVGNHD F I EA YKCQTV I VQS FLRAFQAH KE ENWAL PVMYAVAL 
DLRVFANNADO^LVKKGKSKVGDMIiEKAABLLMSCFRVCASDTR 
AGIEDSKKWGMLFLVNQIiFKIYFKINKLHIiCKPblRAIDSSNLK 
DDYSTAQRVTYKYYVGRKAMFDSDFKQAEEYLSFAFEHCHRSSQ 
KNKRMIL I YLLP VKMLLGHMPTVTBLLKKYHLMQFAEVTRAVS EG 
NLLLLHEALAKHEAFFI RCG 1 PL I LE KLKI IT YRNLFKKVYLLL 
KTHQLSLDAFL VALKFMQVEDVD I DE VQCILANLI YMGHVKG Y Z 
SHQHQKLWS KQNPFPP LS TGC 


7086


256 


525 


I LAARMG KQNS KL R PEVMQDLLESTD FTEHE I QE W YKG FLRD CP 
S GHLS MEEFKKI YGNFFP YGDASKFAEHVFRT FDANGDGT I DFR 
EF 


7087 


166 


723 


LSGS SAGKVAAPCVPP SNHELVP I TTENAPKNWDKGEGASRGG 
NTRKSLEDNGSTRVTPSVQPHLQPIRNMSVSRTMEDSCELPLVY 
VTERI I AVS FPS TANEEN FRSNLRE VAQMIiKS KHGGN YLL FNLS 
ERRPD ITKLHAKVLEFGW PDIiHTPALEKI CS I CKAMDT WLNAH P 
HRCRVLHNKG 


7088 


104 


759 


GTSAAS PSS LLEMAjGE ITETCELYS S YVGLVYMFNLI VGTGALT " 
HFKAFATAG WLVS LVLLVFLGFMS FMTTTFVI EAMAAANAQ L»HW 
KRMBNLKEEEDDDSSTASDSDVLIRDNYERAEKRP ILSVQRRGS 
PNPFEITDRVEMGQMASMFFNKVGVNLPYFCI IVYLYGDLAI YA 
AAVPFSLMQVTCSATGNDS CGVE ADTKYNDTDRCWG PLRRVD


7089


33 


1775 


S VCWEDRYLKARMEES PLSRAPSRGGVNFLNVARTYI PNTJCVEC 
HYTLP PGTMPS AS D WIG I PKVE AACVRDYHTFVWS S VPESTTDG 
S PIHTSVQFQAS YLPKPGAQLYQFRYVNRQGQVCGQSPPFQFRE 
PRP^ELVTLEE ADGG^DI LLVVPKATVLQNQLDESQQERJNDLM 
QLKLQLEGQVTELRSRVQELERAIiATARQEHTELMEQYKGISRS 
HGEITEBRDILSRQQGDHVARILELEDDIQTISEKVLTKEVEIiD 
RLRDTVKALTREQEKLIX5QLKEV0^\DKEQSEAELQVAQQENHHL 
MLDIiKEAKSWQEEQSAQAQRLKDKVAQMKDTLGQAQQRVABLEP 






WO 01/53312 



PCT/US00/34263 



LKEQLRC^QELAASSQQKATLLGEELASAAAARDRTIAELHRSR 
LE VAE VNGKLAELGLHLKEEKCQWS KERAGLLQS VE AEKDKI LK 
]jS AB I LR LEKAVQEB RTQNQV F KT E LAREKD SSLVQL S ES KREfi 
TELRSALRVLQKEKEQLOEEKQELIiE YMRKLEARLE KVADE KWN 
EDATTEDE EAAVGLS C PAALTDS EDES PEDMRLHPMAFVSVBTQ 
ASLLLGLE 


7090 


33 


1775 


SVCWEDRYLKARMEESPLSRAPSRGGVNFLNVARTYIPNTKVEC 
HYTLPPGTMPSASDW1GIFKVEAACVRDYHTFVWSSVPESTTDG 
SPIHTSVQFQASYLPKPGAQLYQFRYVNRQGQVCGQSPPFQFRB 
PRPMDELVTLEEADGGSDILLWPKATVLQNQliDBSQQERNDLM 
QLKLQLEGQVTELRSRVQELERALATARQEHTELMEQYKGISRS 
HGE I TEERD I LS RQQGDHVAR I LELEDD IQT I S E KVLTKE VELD 
RLRDTVKALTREQEKLLG QLKE VQ ADKEQS EAELQVAQQENHHL 
NLDLKE AKS WQEEQSAQAQRIiKDKVAQMKDTLGQAQQR VAE IiEP 
LKBQLRGAQELAASSQQKATIiLGEEIiASAAAARDRTXAELHRSR 
IiBVAEVNGKIiAEIiGLHLKEEKCQWSKERAGLLQSVEAEKDKILK 
LSAEILRLEKAVQEERTQNQVFKTELAREKDSSLVQLSESKREL 
TELRSALRVLQKEKEQLQEEKQBLLEYMRKLEARLEKVADEKWN 
EBATTEDEEAAVGLSCPAALTDSEDBSPEDMRLHPMAFVSVETQ 
ASLLLGLE 


7091 


186 


1076 


EGMLTREHRCGRSEEQEIiEPWPSPKKARSGRWLRNGFKRKMEEP 
EEPADSGQSLVPVYIYSPEYVSMCDSLAKIPKRASMVHSLIEAY 
ALHKQMRIVKPKV7ASMEEMATFHTDAYLQHLQKVSQEGDDDHPD 
SIEYGLGYDCPATEGIFDYAAAIGGATITAAQCLIDGMCKVAIN 
WSGGWHHAKKDEASGFCYLNDAVLGILRLRRKFERILYVDLDLH 
HGDGVEDAFS FTSKVMTVSLHKFSPGFFPGTGDVSDVGLGKGRY 
YS VNVPIQDG I QDEKYYQ I CER YEPPAPNPG L 


7092 


522 


909 


KQQINEDQEESQKPRIiGEGCEPISKRQMKKLlKQKQWBEQRELR 
KQKRKEKRKRKKLERQCQMEPNSDGHDRKRVRRDWHSTLRLI I 
DCSFDXLM 


7093 


454 


655 


NFGVSGVELAQQASMVRMSFVIAACQLVLGLLMTSLTESSIQNS 
HCPQLCVCEIRPWFTPQSTYREA 


7094 


2 


508 


FVRSMHWGVGFASSRPCWDLSWNQSISFFGWWAGSEBPF3FYG ' 
DI IAFPLQD YGG INAGIX3SDP WWKKTLYLTGGALLAAAAYIiLHE 
LLVIRKQQEIDS KDAI I LHQ PARPNNGVPSLS P FCUKMETYLRM 
ADLPYQMYFGGKLSAQGKMPWIEYNHEKVSGTEFI I 


7095 


1 


411 


IASSLPKMASLLQSDRVLYLVQGEKKVRAPLSQLYFCRYCSELR 
SLECVSHEVDSHYCPSCLENMPSAEAKLKKNRCANCFDCPGCMH 
TLSTRATSISTQLPDDPAKTTMKKAYYLACGFCRWTSRDVGMAD 
KSVGE 


7096 


224 


2067 


ETRSLAVQEKPSQAGRRRSSRISFAGALFLTRFLLQELLLNNFC 
5AMSPAPDAAPAPAS ISLFDLSADAPVFQGLSLVSHAPGEALAR 
APRTS CS GSGERES PERKLLQGPMD I SEKLFCS TCDQT FQNHQE 
QREH Y KLDWHRFNLKQRLKDKPLLSALDFEKQS S TGDLSS ISGS 
EDSDSAS EEDLQTLDRERATFEKLSRPPGFYPHR VLFQNAQGQF 
LYAYRCVLGPHQDPPBEAELIjIiQNLQS KGPRDCVVLP4AAAGHFA 
GAIPQGREWTHKTFHRYTVRAKRGTAQGIiRDARGGPSHSAGAN 
IiRRYNEATLYKDVRDLLAGPSWAKALEEAGTILIiRAPRSGRSLF 
FK3GKGAPLQRGDPRL WDI PLATRRFTFQELQRVLHKLTTLHVYE 
EDPRBAVRLHSPQTHWKTVREERKKPTEEEIRKICRDEKEALGG 
NEESPKQGSGSEGEDGFQVELELVELTVGTtiDLCESEVLPKRRR 
RKRNKKEKSRDQEAGAHRTLLQQTQEEEPSTQSSQAVAAPLGPL 

ldbakapgqpelwkallaacragdvgvlxlqlaps padprvls l 
lsaplgsggftllhaaaaagrgswrlliieagadptvqcqdh 


7097 


256 


1228 


IRTKSAATWEAWPQCGREGSRI ITEPCEANAGSRQELQTERISS 
FLAAQGDQAFHSGZiETNNSNS ELPLRVGLKVAQGS PIA4GGQVSA 








SNSFSRLHCRNANEDWMSALCPRLWDVPLHHLS IPGSHDTMTYC"" 
LNKKS P ISHEBS RLLQLLNKALPCI TRFVVLKWSVTQALDVTEQ 
I^GVRYUJIJIIWMLEGSEKNLHFVHMVYTTALVEDTLTEISB 
WLERHPREWI LACRNFEGLSEDLHEYLVACIKNI FGDMLCPRG 
EVPTLRQLWSRGQQV I VS YEDES SLURHHELWPG VP YWWGNRVK 
TBALIRYLETMKSCGR 


7098 


82 


956 


SSFLKRCRKVLGCWGIPSEQSLFSTLEEPRDKEIDNYCVMRLQT 
EARSGFWAPNRFPVNI CRMTAVDGDRGGSSRETCRCHFHPSLEA 
LVLLLQDWQPGGVGI CT SFLG ISW ALLDYHRALRTCLPS KPLLG 
LGSS VI YFLWNLLLLW PRVLAVALFSALFPS YVALHFLGLWLVL 
LLWVWLQGTD FMPDPSS E WLYRVTVATI LYFS W FNVAEGRTRGR 
AI IHFAFLLSDS ILLVATWVTHSS WLPSGI PLQLWLPVGCGCFF 
LGLALRLVYYHWIiHPSCCWKPDPDQVD 


7099 


992 


210 


LFRLAPGFLRSIiARQGYHQIWAFPFLPSGATATWPAASRSRSlA 
ARSLPRSPARPGPNDALLGEHDFRGQGVRAQRFRFSEEPGPGAD 
GAVLEVHVPQIGAGVSLPGI LiAAKCGAEVI LSDSSELPHCLE VC 
RQSCQMNIUjPHLQWGLTWGHISWDIjIJU^PPQDIIIiASDVFFEP 
EDFED I LAT I YFLMHKNPKVQLWST YQVRS ADWSLEALL YKWDM 
KCVHI P LE S FDADKEDIAES TLPGRHTVEMLVI S EAKDS L 


7100 


205 


O / X 


ANGGFWEAAPGSEVSLPLWVPTASHSKTTALGIGSAPPPHLSVLt 
FLFSFPPQLGDPLEAFPVFKKYDRNGLiWSIECKRVSGLEPATV 
DVmFDLTKlTWQTMYEQSEWGWKDREKREEMTDDRAWYliIAWEN 

C CT7 Dtf TV 17CUDD PTO rDH/^HWi rr vtvi 

V Jr VAT orlr K r JJ VKRGDE VTjXW 


7101 


2 


503 


WRGGPRRAKRIJVGGAVGWVLLVRdVHSVRAGGGRPPRAAbMKKD 
VRILLVGEPRVGKTSLIMSLVSEEFPEEVPPRAEEITIPADVTP 
ERVPTHIVDYSEAEQSDEQLHQEIfiQANVI CI VYAVNNKHS IDK 
VTSRWI PLINERTDKDSRLPLILGGNXSDLVBYSR 


7102 


2 


503 


WRGGPRRAKRIAGGAVGWVLLVRGVHSVRAGGGRPPRAADMKKD 
VR ILLVGE PRVGKTSLIMSLVSEEFPEEVPPRAEE I T1* PADVTP 
ERVPTHIVDYSEAEQSDEQLHQEISQANVICIVYAVNNKHS IDK 
VTSRWIPLINERTDKDSRLPLILGGNKSDLVBYSR 


7103 


119 


438 


GSQSSVAVNIRSGTDEESMDLMKGQASSVNIAATASEKSSSSES 
LSDKGSELKKSFDAWFDVLKVTPBEYAGQITLMDVPVFKAIQP 
DELSS CGWNKKEKYSSAP 


7104 


1670 




Kjuvy isnitb V aiUiH^y WUJjBS PGCIjIjIiHP SLPEEERVD I III NWAGV 
MRCPHWTTETCFEMQFGVNHLGEAWAGAAPWVQAILPRRPPKVL 
GF* V* VKSDLF 1 1 LNPGHFLLTNLLLDKLKASAP SR I 1 NLS S LA 
HVAGH IDFDDLNWQTRKYNTKAAYCQS \ KLAI VLFTKELSRRLQ 
GSGVTVNALHPG VARTELGRHTGIHGS TFLQHHN \ WAHL LAAWS 
KS PRS WPAP AQHNTLAVAEE LA\ VI SG KYFDGLKQKAPAPEAED 
EEVARRLWAESARLVGLEAPSVREQPLPR 


7105 


765 


143 


GQMCRR PS PKSTS CLSMTCDLP / RGLQD PQ CLALFRVAVDKHQA 
LiUUtAnovUU VUKH Li trAii Y I VSRFLHIiQSPFLTQvHSJSQWQLST 
SQI PVQQMHLFDVHNYPDYVS SGGGFGPADDHGYGVS YI FMGDG 
M I TFH I SS KKSS TKTDSHRLGQHI EDALLDVAS LFQAGQHFKRR 
FRGSGKENS RHRCGFLS RQTGAS KASMTS TDF 


7106 


14 


1064 


GIiQAGHPHPRSASRIPEADTH\YSKLQRAFDSIWKDHKRMFGT 
YFRVGFFGSKFGDLDEQEFVYKEPAITKLPEISHRLEAFYGQCF 
GAEFVEVIKDSTPVDKTKLDPNKAYIQITFVEPYFDEYEMKDRV 
TYFEKNFNLRRFM YTTP FTLEGR PRGBLHEQ YRRNTVLTTMHAF 
PYIKTRISVIQKEEFVLTPIEVAlEDPIKiCKTLQLAVAINQEPPD 
AKMLQMVLQGS VGATVNQG P LE VAQ VFLAE I PAD PKLYRHHNKL 
RLCFKEFIMRCGEAVEKNKRLITADQREYQQELKKNYNfCLKENL 
RPMIERKIPELYKPIFRVESQKRDSFHRSSFRKCETQLSQGS 


7107 


1145 


591 


*I*WLQTGKKK 


7108 


1 




VKVALLLTNLEQPRTESEWENSFTLKMFLFQFVNLNSSTFYIAF 
FLGRFTGHPGAYLRLINRWRLEECHPSGCLIDLCMQMGI IMVLK 
QTWNNFMEI/3Y PLIQNWWTRRKVRQEHGPBRKI SFPQWEKDYNIi 
QPMNAYGLFDEYLEMILQPGFTTIPVAAFPIiAPIiIiALLNNI IEI 
RLDAYKPVTQWRRPLASRAKDIGIWYGILEGIGILSVITNAPVI 
AITSDFI PRLVYA YKYGPC&GQGEAGQKCM VGY VNASLS VFRI S 
DFENRSEPBSDGSEFSGTPLKYCRYRDYRDPPHSIiVPYGYTIiQP 
WHVLAW 


7109 


964 


102 


WDQRKRNSLVPGPAHGPAQEEPWEKKESLGAAQEALSIQLQPKE ' 

TQPFPKSEQVYLHFLSWTEDGPEPKDKGSLPQPPITEVESQVF 

SEKLATDTSTFEATSEGTLBLQQRKPKAERLRWSPAQEESFRQM 

WIHKEIPTGKKDHECSECGKTFIYNSHLWHQRVHSGEKPYKC 

SDCG KTFKQ S S NLG QHQR I HTGEKP FE CNECG KAFRWGAHLVQH 

QRIH3GEKPYECNECGKAFSQSSYLSQHRRIHSGEKPFICKECG 

KA YG WCS EL I RHRR VHARKE PSH 


7110 


96 


697 


RLDNFSGFLVEVTKEERHIVKPLYDRYRLVKQMLTRASITPVIjG" 
S PSTKRRGQMLQ P 1 1 EGETAHFFEE I KE EEE DG VNLS SELG DM L 
KTAVQ VQSSLKNS ESDVEENQEKLALDLRLSSSRAASMPELLEQ 
LWKARAEKKKLRKTLREFEEAFYQQNGRNAQKEDRVPVLEEYRE 
YKKIKAKLRIiLBVLISKQDSSKSl 


7111 


2 


414 


GSGLYRGPTPGGQCIWKPNSMPPDHERNFGFTQFALELNEIjTAE 
LKRSIiPSTDTRLRPDQRYLEEGNlQAAEAQKRRIEQLQRDRRKV 
MEENNIVHQARFFRRQTDSSGKEWVnrTNNTYWRLRAEPGYGNMD 
GAVLW 


7112 


103 


495 


PRCFPVADRGRIilGGLPDWTIMEGKTLNLTCTVFGNPDPEVIW 
FKNDQD IQLSBHFS VKVBQAKYVSMT I KGVTS EDSGKYS INI KN 
KYGGE KIDVTVSVYKHGBKI PDMAPPQQAKPKLI PASASAAGQ 


7113 


1 


824 


KCIiRQAWHEAPSSLAFTRWCSREERAEGGGNLHRS ITRDPKPPG 
IiRPSQRPMDDKKKKRSPKPCLAQPAQAPGTLRRVPVPTSHSGSL 
ALGLPHLPSPKQRAKFKRVGKEKGRPVLAGGGSGSAGTPLQHSF 
LTEVTDVYEWEGGLLNLLNDFHSGRLQAFGKECS FEQLEHVREM 
QEKIiARI*HFS LD VCGEEEDDEEEEDG VTEG LPEEQKKTMADRNL 
DQLLSNLGSCLGALVPGGMRGGEGTYSQSHSWALGEKVGVHGSK 
SSGPLNLPRR 


7114 


3 


1492 


VWEVDEQIDHYKESQDKFLWQAAFIGKETLKDESGQBCKICRKI 
IYLNTDFVSVKQRLPKYYSWERCSKHHLNFLGQNRSYVRKKDDG 
CKAYWKVCLHYNIjHKAQPAERFFDPNQRGKALHQKQALRKSQRS 
QIGEKLYKCTEC^KVFIQKANLVVHQRTHTGEKPYECCECaKAF 
SQKSTL IAHQRTH7GEKPYE CSECX3KTPIQKSTL IKHQRTHTGE 
KPFVCDKCPKAFKSSYHLIRHEKTHIRQAFYKGIKCTTSSLIYQ 
RIHTSEKPQCS8HGKASDEKPSPTKHWRTHTKENIYECSKCGKS 
FRGKSHLSVHQRIHTGBKPYECSICGKTFSGKSHLSVHHRTHTG 
EKP YBCRRCGKAFG EKSTX* I VHQRMKTGEKP YKCNE CGKAFS E K 
S PL I KHQRIHTGERPY ECTDCKKAFSRKSTLIKHQR I HTGEKP Y 

VPCPrV^T^a*rCVT^OT^»T*»/UiroT'UTV^T7V'DVl?r , DIV , «V'"& DCOr/CTT *r 

m-o CiV».v?ivrtT o v iva l LiX VrLHK i n. IvifMxlrlJSUXCUI.uJUUrauKoTlil 
KHQRSHTGDKNL 


7115 


1


947 


NAAHGYNWGLWCMYI I PPQDWLDRGDESAP IRTPAMIGCSFWD 
REYFGDIGLLDPGMEVYGGENVKLGMRVWQCGGSMEVLPCSRVA 
HIERTRKPYNNDIDYYAKRNALRAAEVWMDDFKSHVYMAWNIPM 
SNPGVDFGDVSERLAIiRQRIjKCRSFKWYLENVYPEMRVYNNTIiT 
YG E VRNS KAS A YCLDQGAE DGDRAI L Y PCHGMSSQLVR YSADG L 
LQLGPLGSTAFLPDSKCIiVDDGTGRMPTLKKCEDVARPTQRLWD 
PTQSGPIVSRATGRCLEVEMSKDANFGLRLWQRCSGQKWMIRN 
WIKHARH 


7116 


866 


95 


rvrmrrnaevieeklsmkswakfrpgepwkgypnidpetdpyvt 
pgs vinnls intvrevdhlrdrnsgs s s slhttlpsts awss ir 








ASNYNVPLS S TAQSTS ARNSDS KLTWSPGS VTNTSLAHELWKVP 
LPPKNITAP8RPPPGLTGQKPPLSTWDNSPLRIGGGWGNSDARY 
TPGSSWGBSSSGRITNWLVLKNLTPQIDGSTLRTLCMQHGPLIT 
FHLNLPHGNALVRYSSXEE WKAQKSLHI SDLFLLTL 


7117 


695 


1261 


LI» I S TPGGCH P P PS S I E FT YTGAWGKALP APHMPCAPGALP QGA" 
FVSQAARAIPLLQPSQAAQAEGLSQPARACGALCSLPWPIiRNWG 
S P I LRLPGGLRTPTNDRKTRTRSAMACWARAQWDTLGPIjKLSHR 
GKVCLRHPRPTGVRGGPGAAGRQGGMGTRRRGTFTSGARDPGGL 
RVKHRCQPTGHLP 


7118 


49 


1863 


PHCE PNPGAG AM VLLHVLFEHAVGY ALLALKE VEE I S LLQPQVE 
ESVLNLGKFHS I VRLVAFCPFAS SQVALENANAVSEGWKEDLR 
LLI^THLPSKKKKVLIXSVGDPKIGAAIQEELGYNCQTGGVIAEI 
LRGVRIJiFHNLVKGLTDLSACKAQLGLGHSYSRAKVKFNVNRVD 
NMI IQS ISLLDQLDKDINTFSMRVREWYGYHFPELVKI INDNAT 
YCRIiAQFIGNRRELNEDKLEKLEEIjTMDGAKAKAlLDASRSSMG 
MD I S A I DLIN I ES FSSRWS hS E YRQS IjHTYLRS KMS Q VAPSLS 
AL I GEAVGARL IAHAGS LTNLAKY PAS TVQ I LG AEKALFRALKT 
RGNTPKYGLI FHSTFIGRAAAKNKGRISRYLANKCSIASRIDCF 
SEVPTSVFGEKLREQVEERLSFYETGEIPRKNLDVMKKAMVQAE 
EAAAEITRKLEKQEKKRLKKEKKRLAAIJ\IASSENSSSTPEECE 
EMSEKPKKKKKQKPQEVPQENGMEDPSISFSKPKKKKSFSKEEL 
MSSDLEETAGSTS I P KRKXSTPKE ETVND PE EAGHRS GS KKKRK 
FSKEEPVSSGPEEAAGKS3SKKKKKFHKASQED 


7119 


49 


1863 


PHCE PN PGAGA1WLLHVLFEHAVG YAIiLAIjKE V EEISLLQ PQVE "" 
ESVUffLGKFHSIVRLVAFCPFASSQVALENANAVSEGWHEDLR 
LLLETHLPSKKKKVLLGVGDPKIGAAIQSELGYNCQTGGVIAEI 
LRGVRLHFHNLVXGLTDLSACKAQIX3LGHSYSRAKVKFNVNRVD 
NMI IQSISLLDQLDKDINTFSMRVREWYGYHFPELVKI INDNAT 
YCRLAQFIGNRRELNEDKLEKLEELTMDGAKAKAILDASRSSMG 
MD I S AIDLINI ES FSSRWSLSEYRQS LHTYLRS KMSQVAPS LS 
ALIGEAVGARLIAHAGSLTNLAJKYPASTVQIUSAEKALFRAIjKT 
RGNTPKYGLI FHSTFIGRAAAKNKGR I SRYLANKCS 1 ASR 1 DCF 
SEVPTSVFGEKLREQVKERLSFYETGEI prknldvmkeamvqae 
eaaaeitrklekqekxrlkkekkrijwualassenssstpeece 
emsekpkkkkkqkpqevpqengmedps isfs kpkkkksfskeel 

MSSDLEETAGSTS IPKRKKSTPKEBTVNDPEEAGHRSGSKKKRK 
FSKEEPVSSGPEEAAGKSSSKKKKKFHKASQED 


7120 


1991 


64 


QLGTRRCLRGDKVTNAMQDFLVTNLE PRFIEPQTANLSWFKDS 
NSTTPLIFVLSPGTDPAADIiYKFAEEMKFSKKLSAISLGQGQGP 
RAEAMMRSS IERGKWVFFQNCHLAPSWMPALERLIEHINPDKVH 
RDFRLWLTSLPSNKFPVS I LQNGSKMTI EPPRGVRANLLKS YSS 
LGEDFLNSCHKVMEFKSLLLSLCLFHGNALERRKFGPLGFNI PY 
EFTDGDLRICISQLKMFLDEYDDIPYKVLKVTAGEINYGGRVTD 
DWDRRCIMNILEDFYNPDVI^PEHSYSASGIYHQIPPTYDLHGY 

T.^VTTf ^T<PT.MnMDRT1W5T.WnM^MTTI7ArtMn"i , lf IvT t y»HT T/^rv%r>v 
«w * J> ROurLiwunr e» X r o JunjJfJ rtJN X X r Ay 1MB i cHLtLAj 1 1 X\ji->U Fa, 

SS S AGS QGREE I VED VTQN I LLKVPE P I NLQ WVMAKYP VL YEES 
KN TVLVQEVI RYNRLLQ VI TQTLQDLL KALKG L WMS S QLE LMA 
ASLYl^TVPELWSAKAYPSLKPLSSWVMDLLQRLDFLQAWIQDG 
IPAVFWISGFFFPQAFLTGTliQNFARKFVISIDTISFDFKVMFE 
APSELTQRPQVG CY IHGLFLEGARWDPEAFQLAES QPKELYTEM 
AVI WLL PTPNRKAQDQDF YLCP I YKTLTRAGTLS TTGHSTNYVT 
AVE I PTHQPQRHW I KRGVAL I CALDY 


7121 


2 


546 


RPLRPW VltSIXJSMVGLMT YGRRQFQSLDTTMRRL I P P FREASAK 
LTTLVDADAEAFT AY LEAMRL P KNT PEE KDRRTAALQEGLRRAV 
S VP LTLAETVAS LW P ALQ ELARCGNIACR SD LQ VAAKALE MG VF 
GAYFNVL INLRDI TD EAFKDQ IHHR VS ShhQ E AKTQAALVLD CL 
ETRQE 


7122 


2 


546 


RPLRPWVLSLGSMVGLMTYGRRQFQSLDTTMRfeLIPPPfelSjSSAK" 
LTTLVDADAEAFTAYLEAMRLPKNTPEEKDRRTAALQEGLRRAV 
S VPLTLAETVASLW P ALQE LAR CGNLACRS D LQ VAAKAL EMG VF 

GAYFNVLINLRDITDEAFKDQIHHRVSSLIiQEAKTQAALVIiDCL 
ETRQE 


7123 




1 A QO 


KPAVPEARSAGTSEAGRSGAEEVSCGSVSGDGAAMRLTPRALCS 
AAQAAWRENFPLCGRDVARWFPGHMAKGLKKMQSSUCLVDCI I E 
VHDARIPLSGRNPLFOETI^LKPHIJ,VLNKm>LADLTEOjQKIMQ 
HLEGEGLKNVIFTNCVKDENVKQI I PMVTELIGRSHRYHRKENL 
E YC IM VIG VPNVGKSSL I NSLRRQHLR KGKATRVGGE PG ITRAV 
MSKI QVSERPLMFLLDTPG VLAPR I ES VETGLKLALCGTVLDHL 
VGBETMADYLI.YTLNKHQRPGYVQHYGLGSACDNVERVLKSVAV 
KIX5KTQKVKVLTGTGNVNVIQPNYPAAARDFLQTFRRGLLGSVM 
LDLDVLRGHPRV 


7124 


2 


382


t,PLTLLLAAPFAHtLLPPGkDQSPCWHPGPALSPGTUSPLSWAM 
ANSGLQLLGY FLALGG WVGI IASTALPQWKQS S YAGDAS IQLRS 
KVFVLESEWGGDSLGLPRDCGWSCLLHSAVRSEKGFWS 


7125 


16$ 


1127 


NCISEKRNYSFSMQKGKGRTSRIRRRXLCGSSESRGVNESHKSB 
FIBLRKWLKARKFQDSNLAPACFPGTGRGLMSQTSLQEGQMI IS 
LPESCLLT\RDTVIRSYLGAYITKWKPPPSPLLALCTFLVSEKH 
AGHRS LLEA\ YLE I LPKAYTCPVCLEPEVVNLLP KSLKAKAEEQ 
RAHVQEFFASSRDFFSSLQPLFAEAVDSIFSYSALLHAWCTVNT 
RAVYL\SPGSGNAFLQSRTPVQLAPYLDLIiNHS PHVQVKAAFNE 
ETHS YEIRTTSRWRKHEEVFI CYGPHDNQRLFLEYGFVSVHNPH 
ACVYVSRGWNQLCS 


7126 


1 


733 


CRDMAAFI VPS PARRCSQ KGSLGHLPTQPWLWAAMS PRGQERGT" 
SHSQAREPQRPGR?njIiGSLQSSPGTLGQAGTASRRRGCMVQRWV 
QVATGRRAVQVPKGALGLALGETSPGASRGMSGGAGGCWALGMA 
PS PVL P SWLLEGPP PWLS 1 1 SDSGTQRPS PRRC PARP S P WG PQC 
WRGGRIASAEASST*TPGSGSRARSGRRSPGSRRRSASAPSPTP 
PTDACA* SCVARPAGSRSSRPAAA 


7127 


1311 


277 


GLPAMCST*KAGYYEETEGDCIPKDR*IEKRPFXEI*RRIPRIF ' " 
AKQKQI *S*NSQKIGASEIDRGRKEADCSDAPAAARIGAVSVFR 
RSTQEARVS PRSKAKSANLRAVRAD* WEHFVLLFHTPEQFIAEC 
ICRST* *K*WHQLC*PLSSL*TGLKRKLLL*VLFRI *WLKDCDV 
♦FOQKI FATNFCNWQNLIQ* BE* KPVEYSVEN*HIMNLLLPM*L 
CQSS LRDQT I VTWRM * RNYSMFRINM I SSL* DGS I H I PLKLHF Y 
PALI FTLTVPINS CCQRPLPLFAHQS I KTLASSQS PMLACLRFL 
LVKKRAF IHTPRS PGCS V* CKHVLVKDNKNNCVGSEV 


7128 


2 


5228 


GRVDLWTILLGRSALRELSQIEAELNKHWRRLLEGLSYYKPPSP"" 
SSAEKVKANKDVAS PLKELGLRISKFLGLDEEQSVQLLQCYLQE 
DYRGTRDSVKTVLQDERQSQALILKIADYYYEERTCILRCVLHL 
LTYFQDE RHPYRVE YADCVD KLEKELVS KYRQQFEEL YICTEAPT 
WETHGNLMTER0VSRWFVOCLREOSMIiT.ETTPT.VVAYPi7MJvocn 
LLVLTKMFKEQGFGSRQTNRHLVDETMDP FVDR IG YFSAL ILVE 
GMDIESLHKCAIiDDRRELHQFAQDGLICQDMDCLMLTFGDIPHH 
AP VLIAWALLRHTLNPEETS S VVRKIGGTAIQLNVFQ YLTRLLQ 
SLASGGNDCTTSTACMCVYGLLS FVLTS LELHTLGNQQD 1 1 DTA 
CE VLADPSLPBLFWGTEPTSGLGI ILDSVCGMFPHLLS PLLQLL 
RAL VSGKS TAKK VYS FLDKMS FYNEL YKHKPHDVTSHEDGTLWR 
RQTPKLLYPLGGQTNLRIPQGTVGQVMLDDRAYLVRWEYSYSSW 
TLFTCE I EMLLHWSTADVI QHCQRVKP I IDLVHKVI STDLSIA 
Z3CLLPITS RI YMLIiQRLTTVIS PPVDVIASCVNCLTVLAARNPA 
KVWTDIiRHTGFLPFVAHPVSSLSQMISAEGMNAGGYGNLLMNSE 
QPQGEYGVTIAFLRLITTLVKGQLGSTQSQGLVPCVMFVLKEML 








PSYHKWRYNSHGVREQIGCLILELIHAILNLCHETDLHSSHTPS 
LQFLC I CSLAYTEAGQ WINI MGIGVDTII)MVMAAQPJISDGAEG 
QGQGQLLI KTVKLAFS VTNNVIRLKPPSNWSPLEQALSQHGAH 
GNNL I AVLAKYI YHKHDPALPRLA1QLLKRIATVAPMS VYACLG 
NDAAAI RDAFLTRLQS K\ I E \ DMR IK\ VM 1 L\ EFLT VAYVETQ P 
GLIELFLNLEVKDG\SDGSKEFSLGMW\SCLHAV/VWEIjIDSQQ 
QDRYWCPPLUHRAAIAFLHALWQDRRDSAMLVLRTKPKFWENLT 
S PLFGTLSPPS ETSE PS I LE TCAL 1MKI I CLEI Y YWKGS LDQ P 
LKDTLKKFSIEKRFAYWSGYVKSLAVHVAETEGSSCTSLLEYQM 
IjVSAWRMLL 1 I ATTHAD I MHIiTDS WRRQL FLDVLDGTKALLL V 
PASVNCLRLGSMKCTLLLILLRQWKRELGSVDEILGPLTEILEG 
VLQADQQLMEKTKAKVFSAFITVLQMKEMKVSDIPQYSQLVLNV 
CETLQEEVIALFDQTRHSLALGSATEDKDSMETDDCSRSRHRDQ 
RDGVCVLGLHLAKELCEVDEDGDS WLQVTRRLP ILPTLLTTLEV 
S LRMKQNIiHFTE ATLHLLLTLARTQQGATAVAGAGI TQS 1 CLPL 
LSVYQLSTNGTAQTPSASRKSLDAPSWPGVYRLSMSIiMEQLLKT 
LRYNFLPEALDFVGVHQERTLQCIiNAVR'TVQSLACLBEADHTVG 
FILQLSNFMKEWHFHLPQLfMRDIQVNLGYLOQACTSFLHSRKML 
QHYLQNKNGDGLPSAV\AQRV\QRPPSAASAAPSSSKQPAADTE 
ASEQQALHTVQYGLLKILSKTI*AAjLRHFTPDVCQ ILLDQSLDJjA 
EYNFLFALSFTTPTFDSEVAPSFGTIiIiATVNVALNMLGELDKKK 

epltqavglstqaegtrtlksllmftmencfyllisqamrylrd 
pavhprdkqrmkcelsselstllsslsryfrrgapsspatgvlp 
spqgkstslskaspesqepliqlvqafvrhmqr 




1 


1054 


frrfrwrrrlh *agpassaggs pgeas gtm& &elp pn inikepr 
wdqstfigranhfftvtdprnilltneqlesarkivhdyrqgiv 
p pgltenelwrakyi ydsafhpdtgekmi ligrmsaqvpmnmti 
tgcmmtfyrttpavlfwqwinqsfnavvnytnrsgdapltvnel 

GTAYVSATTGAVATALGLNALTKHVSPLIGRFVPFAAVAAANCI 
NrPLMRQRELKVGIPVTDENGNRLGESANAAKQAlTQVWSRIL 
MAAftsMAl FP Fx MNTIiEKKAFLKRFP WMSAPIQVGLVGFCLVFA 
TPLCCALFPQKSSMSVTSLEAELQAKIQESHPELRRVYFNKGIi 


7130


2 


780 


HEVPSLQTSDPLPGSVQRCSVWSQPNKENWCQDHLYNSLGRKG 
I &AAi>gp x HR£> QSSSS VL INKSMDS INYPSDVGKQQLLSLHRSS 
RCES HQDLLPDI ADSHQQGTE KLS DLTLQDS QKVWVNRNLPLN 
AQIATQNYFSNFKETDGDEDDYVEIKSEEDESELELSHNRRRKS 
DSKFVDADFSDjWCSGNTLHSLNSPRTPKKPVNSKLGLS p yltp 
YNDS DKLNDYLWRGPS PNQQN I VQSLREKFQCLSS S S FA 


7131 


805 


573 


aaaeghiewkflieackvnpfakdrwgniplddavqfnhlew"" 
kllqdyqdsytlsetqaeaaaealskenlesmv 


7132 


1420 


1087 


idmlllsgalvsgpytlittavsadlgthkslkgnahalstvta 
i iixstgsvgaai^plliagllspsgwsnwymlmfadacallfli 
rlihkelscpgsatgdqvpfkeq 


7133 


2 


3648 


OQIPGLLPAHGESGDAIiRKPRIiQKPITGHIiDDLFFTLYPSLEKF 
EEELLELHVQDHFQEGCGPLDGGALEILERRLRVGVHNGIiGFVQ 
RPQVWLVPEMDVALTRSASFSRKWSSSKTSSGSQALVLRSRL 
RLPEMVGHPAFAVI FQjLE YVFS S PAGVDGNAAS VTS LSNLACMH 

mvrwavwnplleadsgrvtlplqggiqpnpshclvykvpsasms 
seevkqvesgtlrfqfslgseehldaptepvsgpkverrpsrkp 
ptspssppapvprviiaapqnspvgpglsisqlaasprsptqhcl 
arptsqlphgsqaspaqaqefpleagishleadlsqtslvlets 
iaeqi^elpftplhapivvgtqtrssagqpsrasmvllqssgfp 
eildankqpaeavsatepvtfnpqkeesdclqsnemvlqflafs 
rvaqdcrgtswpktvyftfqfyrfppattprlqlvqldeagqps 

SGALTH I LVP VSRDGTFDAGSPGFQIiR ymvgpg flkpgerrcfa 
RYLAVQTLQIDVWDGDSLLLIGSAAVQMKHLLRQGRPAVQASHE 








LEWATEYEQDNMWSGDMLGFGRVKPIGVHSWKGRLHLTLAN 
VGHPCEQKVRGCSTLPPSRSRVISNDGASRFSGGSLLTTGSSRR 
KHW QAQKLADVDS E LAAMLLTHARQGKGP QD VSR BS DATRRR K 
LERMRS VRLQEAGGDLGRRGTS VLAQQSVRTQHLRDLQVIAAYR 
ERTKAES I AS LLSLAI TTEHTLHATIX3 VABFFEFVLKNPHNTQH 
TVTVEIDNPELSVIVDSQBWRDPKGAAGLHTPVEEDMFHLRGSL 
APQLYLRPHETAHVPPKPQSFSAGQLAMVQASPGLSNEKGMDAV 
SPWKSSAVPTKHAKVLPRASGGKPIAVLCLTVEIjQPHWDQVPR 
FYHPELSFLKKAIRLPPWHTFPGAPVGMU3EDPPVHVRCSDPNV 

icbtqnvgpgeprdiflkvasgpspeikdffvi^ysdrwiatpt 
qtwqvyi^slqrvdvscvagqltrlslvlrgtqtvrkvraftsh 
pqelktdpkgvfvlpprgvqdlhvgvrplragsrfvhlnlvdvd 

CHQLVAS WLVCLCCRQPI* I SKAFE I MLAAGBGKG VNKR I TYTNP 

YPSRRTFHLHSDHPELLRFREDSFQVGGGETYTIGIjQFAPSQRV 
GEEEILI YINDHEDKNEEAFCVKVI YQ 


7134 


2115 


1111 


ggegfsypphvglslgtpldphyvllevhydnptyeegLidnsg^ 
lrlfytmdirkydagvieaglwvsi^htippgmpefqseghctl 
ecleealeaekpsgihvpavllhahlagrgirlrhprkgkemkl 
laydddfdfnfqefqylkeeqtilpgdulitbcryntkdraemt 
wgglstrsemclsyllyyprinltrcas I PDI meqlqfigvkei 
yrpvttwppiikspkqyknlsfmdamnkfkwtkkeglsfnklvl 
slpvnvrcsktdnaewsicsgmtalppdierpykaeplvcgtsss 
sslhrdfs inllvclllls ctl5tksl 


7135 


2 


2072 


FVPRVTPRSLSLQGPKGBSVGSITQPLjpSSVtlFRAASESDGfeC ' 

WLDALELALRCSSLLRLGTCKPGRDGEPGTSPDASPSSLCGLPA 

SATVHPDQDLFPLNGSSIiENDAFSDKSERENPEESDTETQDHSR 

KTESGSDQSETPGAPVRRGTTYVEQVQEELGELGEASQVETVSE 

ENKSLMWTLLKQLRPGMDLSRWLFTFVLEPRS FLNKLSDYYYH 

ADLLSRAAVEEDAYSRMKliVLRWYLSGFYKKPKGIKKPYNPILG 

ETFRCCWFHPQTDSRTFYIAEQVSHHPPVSAFHVSNRKDGFCIS 

GSITAKSRFYGNSLSALLDGKATLTFLNRAEDYTLTMPYAHCKG 

ILYGTMTLELGGKVTIECAKNNFQAQLEFKLKPFFGGSTSINQI 

SGKITSGEEVLASLSGHWDRDVFIKEEGSGSSALFWTPSGEVRR 

QR LRQHT VPIiEEQTELESERLWQHVTRAI SKGDQHRATQEKFAL 

EEAQRQRARERQESLMPWKPQIiFHLDPITQEWHYRYEDHSPWDP 

LKDIAQFEQDGILRTLQQEAVARQTTFLGS PGPRHERSGPDQRL 

RKASDQPSGHSQATESSGSTPESCPELSDEEQDGDFVPGGESPC 

PRCKKEARRLO^HEAILSIREAQaBLHRHLSAMLSSTARAAQA 

PTPGLLQS PRSWFLLCVFLACQLFINHILK 


7136 


2 


418 


ue vtr&s KKKSUW ibyxvwjjUKAATijEKEVAGLREKIHHLDDMLK 
SQQRKVRQMIEQLGNS KAVIQS KDATIQEUCEKIAYLBAEKLEM 
HDRMEHL I BKQISHGNFSTQARAKTENPGS IRI SKPPS PKPMPV 
IRWET 


7137 


2 


466 


WASGMSTVPGGSRHSLGIQVRGGWGVTGGEEESLTVPVADTWQA 
GSFKVATQERNPQRAQMRLRRQKKGWPFLGDFLTELQRLDSAI 

PDDUJGNTNKRSKEVRVLQEMQLLQVAAMNYRLRPLEKFV^YFT 
RMEQLSDKESYKLSCQLEPENP 


7138


2 


466


WASGMSTVPGGSRHSLGIQVRGGWGVTGGEEESLTVPVADTWQA " 
GSFKVATOERNPQRAQMRLRRQKKGWPFLGDFLTELQRLDSAI 
PDDLDGNTNKRSKBVRVLQEMQLLQVAAMNYRLRPLEKFVTYFT 
RMEQLSDKES YKLS CQLEPENP 


7139 


1 


357 


SLRNSARGLKMAASAARGAAALRRSIKQPVAFVRRIPWTAASSQ 
LKEHFAQFGHVRRCIL PFDKETGFHRGLG WVQFS S EEGLRNALO 
QENHI I DG VKVQ VHTRRP KLPQTS DDE KKD F 


7140 


i4bi 


1357 


RASSLQVLKAWGGLIPSSFQOX2HTGQYALEEI/FDLKVYDCFCSF 
NMNVSLEKQLRPSQPWPRGKCRKTPGWEEARPKAQDLRGDLGKT 








QAGPAEAHTRGPPRLPAATGCPPHLPGLLSGISVDIDPTGLQSQ 
WTPKGQDP PLM FS EDYOKSLLEOYHliGIiDOKT .w irvwr i?r t uktt? 
ADFMTNQCG 


7141 


124 


1073 


L.DSRSCWLDMEDLBEDVRFIVDETLDFGGLSPSDSREEEDITVL 
VTPEKPLRRGLSHRSDPNAVAPAPQGVRLSLGPLSPEKLEEILD 
EANRliAAOLEOCALODRESAGPGTJ^PRWWDODDowriyirr vnon 

VRDLLPTVNSLTRSTPS/LKQPDASTPE* * *EGVSQGSPGYI WK 

EALQHEEGVTHLQSVPCIQKPSIFSS\SRSTPPVRGRAGPSGRA 

AASEETRAAKLRGAAAKSSCQLPIPSAIPRPASRMPLTSRSVPP 

GRGALPPDSLSTRKGLPRPSTAGHRVRESGHKVPVSQRLNLPVM 
GATRSNLQPP 


7142 


658 


839 


ItTPT.Mr.WMRT.IfMT.qg\/ f rr.UTPi\PT.VKtT/-<T tr it/no <>vr ttiav 

aj* i*j*M*Aiijjivi T i*jj>o v x nnxivftr jux V» 1 v_JjAjr i oCLiIr QNVXiNLIj 
KK * SRAVGWWM CRT/YSSDLQVGVI KPWLLLGS QDAAHDIiDT 
LKKNKVTHILNVAYGVENAFLSDFTYKSISILDLPETNILSYFP 
ECFEFIEEAKRKDGWLVHCNA 


7143 


3 


773 


SLEMSSIXSEPI^I^DSEDSISSTIMDVDSTISSGRSTPAMMNGQ"' 
GSTTSSS KNIAYNCCWDQCQACFNSSPDLADHIRS IHVDGQRGG 
VFVCLWKGCKVYOTPSTSQS WLQRHMLTHSGDKP FKCVVGGCNA 
SFASQGGIARHVPTHFSQQNSSKVSSQPKAKEESPSKAGMNKRR 
KLKNKRRRSIiARPHDFFDAQTLDAIRHRAICFNLSAHlESLGKG 
HS WFHS TVS I LLFFQ IKYKTLQKNIST 1 1SKS LK 1 


7144 


1 


988 


FRVNMQDGGPSPAEHSKAEESAGMEARFLGLPDAAGSSGPTPAR 
RCPAPRPAGVSWIRDEVEKYNHNGVNALQLDPALNPXFTAGRD 
S 1 1 RI WS VNQHKQDP YIASMEHHTDWVND IVLCCNGKTLI S ASS 
DTTVKVWNAHKGFCMSTLRTHKDYVKALAYAKDKELVASAGLDR 
Q I FLWDVNTLTALTAS NNTVTTS SLSGNKDS I YSLAMNQLGT 1 1 
VSGS TEKVLR WDPRTCAKLMKLKGHTDNVKAIiLLNRDGTQCLS 
GSSDGTIRLWSLGQQRCIATYRVHDEGVWALQVNDAFTHVYSGG 
RDRKIYCTDLRNPDIRVLICE 
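
As an illustrative aid only, and not as part of the original disclosure, the column legend of the table above can be read programmatically. The minimal Python sketch below expands the one-letter codes and special symbols defined in that legend and computes how many whole codons lie between a listed pair of nucleotide locations. The names `LEGEND`, `describe_segment`, and `codon_span` are hypothetical. The sketch assumes that the two locations bound a contiguous reading frame of three nucleotides per residue, and it ignores the order of the two locations because some entries (for example SEQ ID NO: 7099) list an end location smaller than the beginning location.

```python
from collections import Counter

# One-letter codes and special symbols as defined in the table legend.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown", "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def describe_segment(segment):
    """Count legend symbols in an amino acid segment.

    Whitespace is skipped, letters are upper-cased before lookup, and any
    character that the legend does not define is tallied as 'unrecognized'.
    """
    counts = Counter()
    for ch in segment:
        if ch.isspace():
            continue
        counts[LEGEND.get(ch.upper(), "unrecognized")] += 1
    return counts


def codon_span(begin, end):
    """Whole codons between two nucleotide locations, inclusive.

    Assumption (ours): the span is one contiguous frame of three
    nucleotides per residue; the order of begin and end is ignored.
    """
    return (abs(end - begin) + 1) // 3


if __name__ == "__main__":
    # Locations listed in the table for SEQ ID NO: 7095.
    print(codon_span(1, 411))
    print(describe_segment("KSVGE"))
```

A count of unrecognized characters above zero is a practical hint that a segment contains transcription damage rather than legend symbols.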



WHAT IS CLAIMED IS:

1. An isolated polynucleotide comprising a nucleotide sequence selected from the
group consisting of SEQ ID NO:1-1786 and 3573-5358, a mature protein coding portion
of SEQ ID NO:1-1786 and 3573-5358, an active domain of SEQ ID NO:1-1786 and
3573-5358, and complementary sequences thereof.

2. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide hybridizes to the polynucleotide of claim 1 under stringent 
hybridization conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide has greater than about 90% sequence identity with the 
polynucleotide of claim 1.

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1.

7. An expression vector comprising the polynucleotide of claim 1.

8. A host cell genetically engineered to comprise the polynucleotide of claim 1.

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 
operatively associated with a regulatory sequence that modulates expression of the 
polynucleotide in the host cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group
consisting of:

(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent
conditions with any one of SEQ ID NO:1-1786 and 3573-5358.

11. A composition comprising the polypeptide of claim 10 and a carrier.

12. An antibody directed against the polypeptide of claim 10.

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a 
complex with the polynucleotide of claim 1 for a period sufficient to form the complex; 
and 

b) detecting the complex, so that if a complex is detected, the
polynucleotide of claim 1 is detected.



14. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample under stringent hybridization conditions 
with nucleic acid primers that anneal to the polynucleotide of claim 1 under such 
conditions; 

b) amplifying a product comprising at least a portion of the 
polynucleotide of claim 1; and 

c) detecting said product and thereby the polynucleotide of claim 1 in
the sample.

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the 
method further comprises reverse transcribing an annealed RNA molecule into a cDNA 
polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a
complex with the polypeptide under conditions and for a period sufficient to form the 
complex; and 

b) detecting formation of the complex, so that if a complex formation 
is detected, the polypeptide of claim 10 is detected. 

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under 
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound 
complex is detected, a compound that binds to the polypeptide of claim 10 is identified. 

18. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a 
cell, under conditions sufficient to form a polypeptide/compound complex, wherein the 
complex drives expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence 
expression, so that if the polypeptide/compound complex is detected, a compound that 
binds to the polypeptide of claim 10 is identified. 

19. A method of producing the polypeptide of claim 10, comprising,

a) culturing a host cell comprising a polynucleotide sequence selected
from the group consisting of a polynucleotide sequence of SEQ ID NO:1-1786 and
3573-5358, a mature protein coding portion of SEQ ID NO:1-1786 and 3573-5358, an active
domain of SEQ ID NO:1-1786 and 3573-5358, complementary sequences thereof and a
polynucleotide sequence hybridizing under stringent conditions to SEQ ID NO:1-1786
and 3573-5358, under conditions sufficient to express the polypeptide in said cell; and

b) isolating the polypeptide from the cell culture or cells of step (a). 

20. An isolated polypeptide comprising an amino acid sequence selected from the
group consisting of any one of the polypeptides SEQ ID NO:1787-3572 and 5359-7144,
the mature protein portion thereof, or the active domain thereof. 

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide
array.

22. A collection of polynucleotides, wherein the collection comprises the sequence
information of at least one of SEQ ID NO:1-1786 and 3573-5358.

23. The collection of claim 22, wherein the collection is provided on a nucleic acid 
array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of
the polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of 
the polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a computer- 
readable format. 

27. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20 
and a pharmaceutically acceptable carrier.

28. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising an antibody that specifically 
binds to a polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier.
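
For illustration only, the "greater than about 90% sequence identity" language of claim 3 can be made concrete with a small calculation. The Python sketch below is not taken from the application: the function name `percent_identity`, the gap character, and the choice of denominator (every aligned column except those where both sequences have a gap) are assumptions, and other conventions for computing identity exist. The inputs are assumed to have been aligned beforehand to equal length by some separate method.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns in which both sequences carry a gap are skipped. A column
    counts as a match only when both symbols are equal and neither is a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1  # x == y and not both gaps, so neither is a gap

    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Nine of ten aligned positions agree in this made-up example.
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

Under this convention a single gap opposite a residue counts as a mismatch, which makes the figure more conservative than conventions that exclude gapped columns altogether.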

This international report has not been established in respect of certain claims under Article 17(2)(a) for the following reasons:

1. [ ] Claim Nos.:

because they relate to subject matter not required to be searched by this Authority, namely:



2. [ ] Claim Nos.:

because they relate to parts of the international application that do not comply with the prescribed requirements to such
an extent that no meaningful international search can be carried out, specifically:



3. [ ] Claim Nos.:

because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a).

This International Searching Authority found multiple inventions in this international application, as follows:
This includes 4 invention Groups and 3572 sequence species.



[X] As all required additional search fees were timely paid by the applicant, this international search report covers all
searchable claims. 

[ ] As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite
payment of any additional fee. 

[ ] As only some of the required additional search fees were timely paid by the applicant, this international search report
covers only those claims for which fees were paid, specifically claims Nos.: 

This application contains the following inventions or groups of inventions which are not so linked as to form a single
inventive concept under PCT Rule 13.1. In order for all inventions to be searched, the appropriate additional search fees must be paid.
Group I, claims 1-11, 13-16, and 19-26, drawn to nucleic acid molecules, vector molecules and host cells containing said nucleic
acids, polypeptides, methods of making said polypeptides and methods of detection using said nucleic acids and polypeptides.
Group II, claims 12 and 28, drawn to antibodies and a method of treatment using a composition comprising said antibodies.
Group III, claims 17-18, drawn to methods of identifying a binding partner to a polypeptide.
Group IV, claim 27, drawn to a method of treatment using a composition comprising polypeptides.

The inventions listed as Groups I-IV do not relate to a single inventive concept under PCT Rule 13.1 because, under PCT Rule 13.2,
they lack the same or corresponding special technical features for the following reasons: Group I encompasses nucleic acids,
polypeptides expressed thereby, vectors and host cells containing same, respectively, and methods of making as well as the first method
of use of this subject matter. Groups II-IV are all directed to different special technical features, summarized as follows: Group II is
directed to an antibody and a method of treatment using same, which antibody undergoes recognition and binding reactions wherein
what is bound is different from what is bound by the compositions of Group I. For example, the polypeptides of Group I do not bind
the polypeptides of Group I as the antibody of Group II does. Identification of a binding partner and treatment are clearly different
special technical features from detection. Group III is directed to the identification of a binding partner of a polypeptide, which is not
identified in any of the other Groups and thus clearly contains its own special technical feature. Group IV is directed to treatment,
which is a clearly different method from the methods in the other Groups. Thus, in summary, each of Groups I-IV is directed to
different special technical features, which supports this lack of unity.

Additionally, each of the claims is directed to more than one species of the generic invention. These species are deemed to lack unity
of invention because they are not so linked as to form a single inventive concept under PCT Rule 13.1. In order for more than one
species to be searched, the appropriate additional search fees must be paid. The species are as follows: The claims include a series of
polynucleotides and the polypeptides encoded thereby as represented by the sequences of SEQ ID NOs: 1-1786 and 3573-5358. Each
of these polynucleotide sequences encodes a separate polypeptide and thus represents a separate gene. Therefore, each of these genes
defines its own special technical feature. In summary, one species is a gene represented by one polynucleotide sequence and one
polypeptide sequence encoded thereby.